Title: Method for preparing polypeptide variants
FIELD OF THE INVENTION 

The present invention relates to a method for preparing polypeptide
variants by in vivo recombination.

BACKGROUND OF THE INVENTION 

The advantages of producing biologically active polypeptides by 
cloning naturally occurring DNA sequences from microorganisms, such 
as fungal organisms and bacteria using recombinant DNA technology 
have been known for quite some years. 

Preparation of novel polypeptide variants and mutants, such as novel
modified enzymes with altered characteristics, e.g. specific
activity, substrate specificity, pH optimum, pI, Km, Vmax etc., has
especially during recent years been used diligently and successfully
for obtaining polypeptides with improved properties.

For instance, within the technical field of enzymes the washing
and/or dishwashing performance of e.g. proteases, lipases, amylases
and cellulases has been improved significantly.

In most cases these improvements have been obtained by site-directed
mutagenesis resulting in substitution, deletion or insertion of
specific amino acid residues which have been chosen either on the
basis of their type or on the basis of their location in the
secondary or tertiary structure of the mature enzyme (see for
instance US patent no. 4,518,584).

An alternative general approach for modifying proteins and enzymes
has been based on random mutagenesis, for instance, as disclosed in
US 4,894,331 and WO 93/01285.

As it is a cumbersome and time consuming process to obtain
polypeptide variants or mutants with improved functional properties,
a few alternative methods for rapid preparation of modified
polypeptides have been suggested.

Weber et al., (1983), Nucleic Acids Research, vol. 11, 5561-5661,
describes a method for modifying genes by in vivo recombination
between two homologous genes. A linear DNA sequence comprising a
plasmid vector flanked by a DNA sequence encoding alpha-1 human
interferon at the 5'-end and a DNA sequence encoding alpha-2 human
interferon at the 3'-end is constructed and transfected into a recA
positive strain of E. coli. Recombinants were identified and
isolated using a resistance marker.

Pompon et al., (1989), Gene 83, p. 15-24, describes a method for
shuffling gene domains of mammalian cytochrome P-450 by in vivo
recombination of partially homologous sequences in Saccharomyces
cerevisiae by transforming Saccharomyces cerevisiae with a linearized
plasmid with filled-in ends, and a DNA fragment being partially
homologous to the ends of said plasmid.

Stemmer, (1994), Proc. Natl. Acad. Sci. USA, Vol. 91, 10747-10751;
Stemmer, (1994), Nature, vol. 370, 389-391, concern methods for
shuffling homologous DNA sequences by an in vitro PCR method. One
cycle of shuffling consists of digesting a pool of homologous genes
with DNase I. The resulting small fragments are reassembled into
full-length genes. Positive recombinant genes containing shuffled
DNA sequences are selected from a DNA library based on their
improved function. Positive recombinants can be used as the starting
material for (an)other shuffling round(s).

US patent no. 5,093,257 (Assignee: Genencor Int. Inc.) discloses a
method for producing hybrid polypeptides by in vivo recombination.
Hybrid DNA sequences are produced by forming a circular vector
comprising a replication sequence, a first DNA sequence encoding the
amino-terminal portion of the hybrid polypeptide, and a second DNA
sequence encoding the carboxy-terminal portion of said hybrid
polypeptide. The circular vector is transformed into a rec positive
microorganism in which the circular vector is amplified. This
results in recombination of said circular vector mediated by the
naturally occurring recombination mechanism of the rec positive
microorganism. Such microorganisms include prokaryotes such as
Bacillus and E. coli, and eukaryotes such as Saccharomyces
cerevisiae.

Despite the existence of the above methods there is still a need for
even better iterative in vivo recombination methods for preparing
novel positive polypeptide variants.

SUMMARY OF THE INVENTION 

The object of the present invention is to provide an improved method
for preparing positive polypeptide variants by an in vivo
recombination method.

The inventor of the present invention has surprisingly found that
such positive polypeptide variants may advantageously be prepared by
shuffling different nucleotide sequences of homologous DNA sequences
by in vivo recombination comprising the steps of

a) forming at least one circular plasmid comprising a DNA sequence 
encoding a polypeptide, 

b) opening said circular plasmid(s) within the DNA sequence(s)
encoding the polypeptide(s),

c) preparing at least one DNA fragment comprising a DNA sequence
homologous to at least a part of the polypeptide coding region on at
least one of the circular plasmid(s),

d) introducing at least one of said opened plasmid(s), together with
at least one of said homologous DNA fragment(s) covering full-length
DNA sequences encoding said polypeptide(s) or parts thereof, into a
recombination host cell,

e) cultivating said recombination host cell, and 

f) screening for positive polypeptide variants. 

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 shows the yeast expression plasmid pJS026 comprising a DNA
sequence encoding the Humicola lanuginosa lipase gene.

Figure 2 shows the yeast expression plasmid pJS037, comprising a DNA
sequence encoding the Humicola lanuginosa lipase gene containing
twelve additional restriction sites.

Figure 3 shows the plasmid pJS026.

Figure 4 shows the plasmid pJS037.

Figure 5 shows the in vivo recombination of the 0.9 kb synthetic
wild-type Humicola lanuginosa lipase with pJS037 using Saccharomyces
cerevisiae as the recombination host cell (described in Example 1).

Figure 6 shows the in vivo recombination of a DNA fragment prepared
from Humicola lanuginosa lipase variant (y) with Humicola lanuginosa
lipase variant (d) comprised in a plasmid using Saccharomyces
cerevisiae as the recombination host cell (described in Example 2).
Figure 7 shows an overview of the location of the inactivation
site of the Humicola lanuginosa lipase gene and the number of the
clone (referred to as "blue number" in the tables). Location of
restriction enzyme sites and clone numbers are relative to the
initiation codon of the lipase gene. In all cases a stop codon was
located in the new reading frame 10 to 50 bp from the frameshift.

Figure 8 shows an overview of the creation of active Humicola
lanuginosa lipase genes from the recombinations in Table 2A and B
by a "mosaic mechanism". Lines indicate the introduction of the
fragment sequence into the vector and lines with an x indicate
sequences that are not introduced in the active lipase colonies.
The primers used for the PCR fragment are shown together with the
location of the frameshift mutation (marked by the restriction site
used for the construction).

Figure 9 shows an overview of fragments used in the recombination
of 2 partially overlapping fragments into a gapped vector. The
primers used for the PCR fragments are shown together with the
location of the frameshift mutation (if not wild type).

Figure 10 shows an overview of fragments used in the recombination
of 3 partially overlapping fragments into a gapped vector. The
primers used for the PCR fragments are shown. The overlap between
PCR353 and 355 is only 10 bp.

DETAILED DESCRIPTION OF THE INVENTION 

The object of the present invention is to provide an improved method
for preparing positive polypeptide variants by an iterative in vivo
recombination method.

The inventor of the present invention has surprisingly found an
efficient method for shuffling homologous DNA sequences in an in
vivo recombination system using a eukaryotic cell as a recombination
host cell.

A "recombination host cell" is in the context of the present
invention a cell capable of mediating shuffling of a number of
homologous DNA sequences.

The term "shuffling" means recombination of nucleotide sequence(s)
between two or more homologous DNA sequences resulting in output DNA
sequences (i.e. DNA sequences having been subjected to a shuffling
cycle) having a number of nucleotides exchanged, in comparison to
the input DNA sequences (i.e. starting point homologous DNA
sequences).

An important advantage of the invention is that mosaic DNA sequences
with multiple replacement points or replacements, not related to the
opening site, are created, which is not observed with Pompon's
method.

Another important advantage of the present invention is that using a
mixture of fragments and opened vectors (in the screening set-up)
gives many different clones the possibility to recombine pairwise or
even triplewise (as can be seen in a couple of the examples below).

The in vivo recombination method of the invention is simple to
perform and results in a high level of mixing of homologous genes or
variants. A large number of variants or homologous genes can be
mixed in one transformation. The mixing of improved variants or wild
type genes followed by screening increases the number of further
improved variants manyfold compared to doing only random
mutagenesis.

Recombination of multiple overlapping fragments is possible with a
high efficiency, increasing the mixing of variants or homologous
genes using the in vivo recombination method. An overlap as small as
10 bp is sufficient for recombination, which may be utilized for very
easy domain shuffling of even distantly related genes.

The invention relates to a method for preparing polypeptide variants
by shuffling different nucleotide sequences of homologous DNA
sequences by in vivo recombination comprising the steps of

a) forming at least one circular plasmid comprising a DNA sequence
encoding a polypeptide,

b) opening said circular plasmid(s) within the DNA sequence(s)
encoding the polypeptide(s),

c) preparing at least one DNA fragment comprising a DNA sequence
homologous to at least a part of the polypeptide coding region on at
least one of the circular plasmid(s),

d) introducing at least one of said opened plasmid(s), together with
at least one of said homologous DNA fragment(s) covering full-length
DNA sequences encoding said polypeptide(s) or parts thereof, into a
recombination host cell,

e) cultivating said recombination host cell, and

f) screening for positive polypeptide variants.

According to the invention more than one cycle of step a) to f) may 
be performed. 

The opening of the plasmid(s) in step b) can be directed toward any
site within the polypeptide coding region of the plasmid. The
plasmid(s) may be opened by any suitable method known in the art.
The opened ends of the plasmid may be filled in with nucleotides as
described in Pompon et al. (1989), supra. It is preferred not to
fill in the opened ends as it might create a frameshift.

It is preferred to open the plasmid(s) around the middle of the
polypeptide coding DNA sequence(s), as this is believed to result in
a more effective recombination between DNA fragment(s) and opened
plasmid(s).

In an embodiment of the invention the DNA fragment (s) is (are) 
prepared under conditions resulting in a low, medium or high random 
mutagenesis frequency. 

To obtain low mutagenesis frequency the DNA sequence(s) (comprising
the DNA fragment(s)) may be prepared by a standard PCR amplification
method (US 4,683,202 or Saiki et al., (1988), Science 239, 487-491).

A medium or high mutagenesis frequency may be obtained by performing
the PCR amplification under conditions which increase the
misincorporation of nucleotides, for instance as described by
Deshler, (1992), GATA 9(4), 103-106; Leung et al., (1989),
Technique, Vol. 1, No. 1, 11-15.

It is also contemplated according to the invention to combine the
PCR amplification (i.e. according to this embodiment also DNA
fragment mutation) with a mutagenesis step using a suitable physical
or chemical mutagenizing agent, e.g., one which induces transitions,
transversions, inversions, scrambling, deletions, and/or insertions.

In the context of the present invention the term "positive
polypeptide variants" means resulting polypeptide variants possessing
functional properties which have been improved in comparison to the
polypeptides producible from the corresponding input DNA sequences.
Examples of such improved properties can be as different as e.g.
biological activity, enzyme washing performance, antibiotic
resistance etc.

Consequently, the screening method to be used for identifying
positive variants depends on the desired improved property of the
polypeptide variant in question.

If, for instance, the polypeptide in question is an enzyme and the 
desired improved functional property is the wash performance, the 
screening in step f) may conveniently be performed by use of a 
filter assay based on the following principle: 

The recombination host cell is incubated on a suitable medium and
under suitable conditions for the enzyme to be secreted, the medium
being provided with a double filter comprising a first
protein-binding filter and on top of that a second filter exhibiting
a low protein binding capability. The recombination host cell is
located on the second filter. Subsequent to the incubation, the first
filter comprising the enzyme secreted from the recombination host
cell is separated from the second filter comprising said cells. The
first filter is subjected to screening for the desired enzymatic
activity and the corresponding microbial colonies present on the
second filter are identified.

The filter used for binding the enzymatic activity may be any
protein binding filter e.g. nylon or nitrocellulose. The top filter
carrying the colonies of the expression organism may be any filter
that has no or low affinity for binding proteins e.g. cellulose
acetate or Durapore®. The filter may be pre-treated with any of the
conditions to be used for screening or may be treated during the
detection of enzymatic activity.

The enzymatic activity may be detected by a dye, fluorescence,
precipitation, pH indicator, IR-absorbance or any other known
technique for detection of enzymatic activity.

The detecting compound may be immobilized by any immobilizing agent
e.g. agarose, agar, gelatine, polyacrylamide, starch, filter paper,
cloth; or any combination of immobilizing agents.

If the improved functional property of the polypeptide is not 
sufficiently good after one cycle of shuffling, the polypeptide may 
be subjected to another cycle. 

In an embodiment of the invention at least one shuffling cycle is a 
backcrossing cycle with the initially used DNA fragment, which may 
be the wild-type DNA fragment. This eliminates non-essential muta- 
tions. Non-essential mutations may also be eliminated by using wild- 
type DNA fragments as the initially used input DNA material. 

It is to be understood that the method of the invention is suitable
for all types of polypeptide, including enzymes such as proteases,
amylases, lipases, cutinases, cellulases, peroxidases and oxidases.

Also contemplated according to the invention are polypeptides having
biological activity such as insulin, ACTH, glucagon, somatostatin,
somatotropin, thymosin, parathyroid hormone, pigmentary hormones,
somatomedin, erythropoietin, luteinizing hormone, chorionic
gonadotropin, hypothalamic releasing factors, antidiuretic hormones,
thyroid stimulating hormone, relaxin, interferon, thrombopoietin
(TPO) and prolactin.

Especially contemplated according to the present invention is
initially to use input DNA sequences being either wild-type, variant
or modified DNA sequences, such as DNA sequences coding for
wild-type, variant or modified enzymes, respectively, in particular
enzymes exhibiting lipolytic activity.

In an embodiment of the invention the lipolytic activity is a lipase
activity derived from the filamentous fungi of the Humicola sp., in
particular Humicola lanuginosa.

In a specific embodiment of the invention the initially used input
DNA fragment to be shuffled with a homologous polypeptide is the
wild-type DNA sequence encoding the Humicola lanuginosa lipase
derived from Humicola lanuginosa DSM 4109 described in EP 305 216
(Novo Nordisk A/S).

Also specifically encompassed by the scope of the invention are input
DNA sequences selected from the group of vectors (a) to (f) and/or
DNA fragments (g) to (aa) coding for Humicola lanuginosa lipase
variants from the list below in the Material and Method section.

Throughout the present application the name Humicola lanuginosa has
been used to identify one preferred parent enzyme, i.e. the one
mentioned immediately above. However, in recent years H. lanuginosa
has also been termed Thermomyces lanuginosus (a species introduced
the first time by Tsiklinsky in 1899) since the fungus shows
morphological and physiological similarity to Thermomyces
lanuginosus. Accordingly, it will be understood that whenever
reference is made to H. lanuginosa this term could be replaced by
Thermomyces lanuginosus. The DNA encoding part of the 18S ribosomal
gene from Thermomyces lanuginosus (or H. lanuginosa) has been
sequenced. The resulting 18S sequence was compared to other 18S
sequences in the GenBank database and a phylogenetic analysis using
parsimony (PAUP, Version 3.1.1, Smithsonian Institution, 1993) has
also been made. This clearly assigns Thermomyces lanuginosus to the
class of Plectomycetes, probably to the order of Eurotiales.
According to the Entrez Browser at the NCBI (National Center for
Biotechnology Information), this relates Thermomyces lanuginosus to
families like Eremascaceae, Monoascaceae, Pseudoeurotiaceae and
Trichocomaceae, the latter containing genera like Emericella,
Aspergillus, Penicillium, Eupenicillium, Paecilomyces, Talaromyces,
Thermoascus and Sclerocleista.

Consequently, such genes encoding lipolytic enzymes of filamentous
fungi of the genera Emericella, Aspergillus, Penicillium,
Eupenicillium, Paecilomyces, Talaromyces, Thermoascus and
Sclerocleista are also specifically contemplated according to the
present invention.

Other examples of relevant filamentous fungi genes encoding
lipolytic enzymes include strains of the Absidia sp. e.g. the
strains listed in WO 96/13578 (from Novo Nordisk A/S) which are
hereby incorporated by reference. Absidia sp. strains listed in WO
96/13578 include Absidia blakesleeana, Absidia corymbifera and
Absidia reflexa.

Strains of Rhizopus sp., in particular Rh. niveus and Rh. oryzae are
also contemplated according to the invention.

The lipolytic gene may also be derived from a bacterium, such as a
strain of the Pseudomonas sp., in particular Ps. fragi, Ps.
stutzeri, Ps. cepacia and Ps. fluorescens (WO 89/04361), or Ps.
plantarii or Ps. gladioli (US 4,950,417) or Ps. alcaligenes and Ps.
pseudoalcaligenes (EP 218 272, EP 331 376, or WO 94/25578
(disclosing variants of the Ps. pseudoalcaligenes lipolytic enzyme)),
the Pseudomonas sp. variants disclosed in EP 407 225, or a
Pseudomonas sp. lipolytic enzyme, such as the Ps. mendocina (also
termed Ps. putida) lipolytic enzyme described in WO 88/09367 and US
5,389,536 or variants thereof as described in US 5,352,594, or Ps.
aeruginosa or Ps. glumae, or Ps. syringae, or Ps. wisconsinensis (WO
96/12012 from Solvay) or a strain of Bacillus sp., e.g. the B.
subtilis described by Dartois et al., (1993) Biochimica et
Biophysica Acta 1131, 253-260, or B. stearothermophilus (JP
64/7744992) or B. pumilus (WO 91/16422) or a strain of Streptomyces
sp., e.g. S. scabies, or a strain of Chromobacterium sp. e.g. C.
viscosum.

In connection with the Pseudomonas sp. lipases it has been found
that lipases from the following organisms have a high degree of
homology, such as at least 60% homology, at least 80% homology or at
least 90% homology, and thus are contemplated to belong to the same
family of lipases: Ps. ATCC21808, Pseudomonas sp. lipase
commercially available as Liposam®, Ps. aeruginosa EF2, Ps.
aeruginosa PAC1R, Ps. aeruginosa PAO1, Ps. aeruginosa TE 3285, Ps.
sp. 109, Ps. pseudoalcaligenes M1, Ps. glumae, Ps. cepacia DSM 3959,
Ps. cepacia M-12-33, Ps. sp. KWI-56, Ps. putida IFO 3458, Ps. putida
IFO 12049 (Gilbert, E. J., (1993), Pseudomonas lipases: Biochemical
properties and molecular cloning. Enzyme Microb. Technol., 15,
634-645). The species Pseudomonas cepacia has recently been
reclassified as Burkholderia cepacia, but is termed Ps. cepacia in
the present application.

Also genes encoding lipolytic enzymes from yeasts are relevant, and
include lipolytic genes from Candida sp., in particular Candida
rugosa, or Geotrichum sp., in particular Geotrichum candidum.

Specific examples of microorganisms comprising genes encoding
lipolytic enzymes used for commercially available products and which
may serve as donor of genes to be shuffled according to the
invention include Humicola lanuginosa, used in Lipolase®, Lipolase®
Ultra, Ps. mendocina used in Lumafast®, Ps. alcaligenes used in
Lipomax®, Fusarium solani, Bacillus sp. (US 5427936, EP 528828),
Ps. mendocina used in Liposam®.

Also the Pseudomonas sp. lipase gene shown in SEQ ID NO 14 is
specifically contemplated according to the invention.

It is to be emphasized that genes encoding lipolytic enzymes to be
shuffled according to the invention may be any of the above mentioned
genes of lipolytic enzymes and any variant, modification, or
truncation thereof. Examples of such genes which are specifically
contemplated include the genes encoding the enzymes described in WO
92/05249, WO 94/01541, WO 94/14951, WO 94/25577, WO 95/22615 and the
protein engineered lipase variants as described in EP 407 225; a
protein engineered Ps. mendocina lipase as described in US 5,352,594;
a cutinase variant as described in WO 94/14964; a variant of an
Aspergillus lipolytic enzyme as described in EP patent 167,309; and
Pseudomonas sp. lipase described in WO 95/06720.

A requirement for the DNA sequences encoding the polypeptide(s) to
be shuffled is that they are at least 60%, preferably at least 70%,
better more than 80%, especially more than 90%, and even better up
to almost 100% homologous. DNA sequences being less homologous will
have less inclination to interact and recombine.
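
As an illustration only, the homology thresholds stated above can be
checked with a short Python sketch. It assumes that homology is
measured as simple percent identity over two pre-aligned sequences of
equal length; the present text does not specify how homology is
calculated, and the function names percent_identity and homology_tier
are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identical positions in two pre-aligned, equal-length sequences.

    Assumption: homology is approximated by identity; gap characters ('-')
    never count as a match.
    """
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and aligned to equal length")
    pairs = zip(seq_a.upper(), seq_b.upper())
    matches = sum(1 for a, b in pairs if a == b and a != "-")
    return 100.0 * matches / len(seq_a)


def homology_tier(identity: float) -> str:
    """Map a percent identity onto the preference tiers named in the text."""
    if identity > 90:
        return "especially preferred (more than 90%)"
    if identity > 80:
        return "better (more than 80%)"
    if identity >= 70:
        return "preferred (at least 70%)"
    if identity >= 60:
        return "acceptable (at least 60%)"
    return "below the stated 60% minimum"


if __name__ == "__main__":
    identity = percent_identity("ATGCATGCAT", "ATGCATGCAA")
    print(identity, homology_tier(identity))
```

A real comparison would first align the two coding sequences; the
sketch only applies the stated percentages to an existing alignment.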



It is also contemplated according to the invention to shuffle parent
(homologous) wild type organisms of different genera.

Further, the DNA fragment(s) to be shuffled may preferably have a
length of from about 20 bp to 8 kb, preferably about 40 bp to 6 kb, 
more preferred about 80 bp to 4 kb, especially about 100 bp to 2 kb, 
to be able to interact optimally with the opened plasmid. 

The method of the invention is very efficient for preparing
polypeptide variants in comparison to prior art methods comprising
transforming linear DNA fragments/sequences.

The inventor found that the transformation frequency of a mixture of
opened plasmid and a DNA fragment was significantly higher than
when transforming a plasmid cut at the same site alone. The
transformation frequency of the opened plasmid and DNA fragment was
as high as for uncut plasmid.

Without being limited to any theory it is believed that the opening
of the plasmid(s) restrict(s) the replication of (opened) plasmid(s)
when not interacting with at least one DNA fragment. In accordance
with this an increased number of recombined DNA sequences were found
after only one shuffling cycle.

As described in Example 1, 50% of the resulting transformants
contained recombined DNA sequences of both input DNA sequences. As
many as 20% of the total number of recombined DNA sequences were
"random" mixtures (i.e. having more than one region of nucleotides
exchanged).
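
The distinction between a simple recombinant and a "random" mixture
can be made concrete with a small Python sketch. It is a hypothetical
analysis helper, not a procedure taken from the Examples: it assumes
that the vector, fragment and output sequences are already aligned to
equal length, and it counts the separate regions in which the output
carries the fragment's nucleotides at positions where the two inputs
differ. The name exchanged_regions is an assumption.

```python
from itertools import groupby


def exchanged_regions(vector: str, fragment: str, output: str) -> int:
    """Count separate regions of the output that derive from the fragment.

    Only positions where vector and fragment differ are informative.
    Assumption: all three sequences are pre-aligned and of equal length.
    """
    if not (len(vector) == len(fragment) == len(output)):
        raise ValueError("sequences must have equal length")
    origins = []
    for v, f, o in zip(vector, fragment, output):
        if v == f:
            continue  # identical in both inputs, origin cannot be assigned
        if o == f:
            origins.append("F")  # nucleotide exchanged in from the fragment
        elif o == v:
            origins.append("V")  # nucleotide retained from the opened vector
        else:
            origins.append("?")  # matches neither input, e.g. a new mutation
    # Each run of consecutive 'F' labels is one exchanged region.
    return sum(1 for origin, _ in groupby(origins) if origin == "F")


if __name__ == "__main__":
    vector = "AAAAAAAAAAAA"
    fragment = "CAACAACAACAA"
    output = "CAAAAACAAAAA"
    regions = exchanged_regions(vector, fragment, output)
    print(regions, "mosaic" if regions > 1 else "single exchange or none")
```

Under these assumptions an output with more than one exchanged region
corresponds to what the text calls a "random" mixture.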

The input DNA sequences may be any DNA sequences including wild-type
DNA sequences, DNA sequences encoding variants or mutants, or
modifications thereof, such as extended or elongated DNA sequences,
and may also be the outcome of DNA sequences having been subjected
to one or more cycles of shuffling (i.e. output DNA sequences)
according to the method of the invention or any other method (e.g.
any of the methods described in the prior art section).

When using the method of the invention the output DNA sequences
(i.e. shuffled DNA sequences) have had a number of nucleotide(s)
exchanged. This results in replacement of at least one amino acid
within the polypeptide variant, if comparing it with the parent
polypeptide. It is to be understood that also silent mutations are
contemplated (i.e. nucleotide exchange which does not result in
changes in the amino acid sequence).

However, the method of the present invention will in most cases lead
to the replacement of a considerable number of amino acids and may in
certain cases even alter the structure of one or more polypeptide
domains (i.e. a folded unit of polypeptide structure).

According to the present invention more than two DNA sequences can
be shuffled at the same time. Actually any number of different DNA
fragments and homologous polypeptides comprised in suitable plasmids
may be shuffled at the same time. This is advantageous as a vast
number of quite different variants can be made rapidly without an
abundance of iterative procedures.

The inventor has tested the nucleotide shuffling method of the
invention using significantly more than two homologous DNA
sequences. As described in Example 2 it was surprisingly found that
the method of the invention advantageously can be used for
recombining more than two DNA sequences.

One cycle of shuffling according to the method of the invention may
result in the exchange of from 1 to 1000 nucleotides into the opened
plasmid DNA sequence encoding the polypeptide in question. The
exchanged nucleotide sequence(s) may be continuous or may be present
as a number of sub-sequences within the full-length sequence(s).



To support the present invention the inventor made a number of
additional experiments on different aspects of the method of the
invention. The experiments are described below and illustrated in
Examples 3 to 6 below.

A number of vectors and fragments comprising inactivated synthetic
Humicola lanuginosa lipase genes were constructed by introducing
frameshift/stop codon mutations in the lipase gene at various
positions. These were used for monitoring the in vivo recombination
of different combinations of opened vector(s) and DNA fragments. The
number of active lipase colonies was scored as described in Example
3. The number of colonies is a measure of the efficiency of the
opened vector(s) and fragment(s) recombination.
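
How a frameshift inactivates a coding sequence can be sketched in
Python. The example below is illustrative only: it uses a single-base
insertion and a short made-up sequence rather than the restriction
site constructions and lipase gene of the Examples, and the function
names are hypothetical. It locates the first stop codon in the new
reading frame, measured from the start of the codon in which the
insertion falls.

```python
from typing import Optional

STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop(seq: str, start: int = 0) -> Optional[int]:
    """Index of the first stop codon read in frame from `start`, else None."""
    seq = seq.upper()
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


def stop_distance_after_insertion(cds: str, pos: int, base: str = "A") -> Optional[int]:
    """Distance in bp from the affected codon to the first new-frame stop codon.

    Assumption: the frameshift is modelled as a one-base insertion at `pos`
    in a coding sequence whose reading frame starts at index 0.
    """
    mutated = cds[:pos] + base + cds[pos:]
    codon_start = pos - (pos % 3)  # start of the codon containing the insertion
    stop = first_stop(mutated, codon_start)
    return None if stop is None else stop - codon_start


if __name__ == "__main__":
    cds = "ATGGCTGATAAGCTTGCATAA"  # toy sequence, not the lipase gene
    print(first_stop(cds))  # position of the normal stop codon
    print(stop_distance_after_insertion(cds, 3))  # premature stop after the shift
```

A premature stop close to the frameshift, as in this toy case, is what
makes such constructs convenient inactive inputs: lipase activity can
only reappear if recombination removes the mutation.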



WO 97/07205    PCT/DK96/00343



One frameshift mutation in said Humicola lanuginosa lipase gene in
the opened vector and another in the fragment on the opposite side
of the opening site gave 3 to 32% of active lipase colonies
depending on the location and combination. It was concluded that
the closer the mutation is to the ends of the vector, the higher
the mixing.

One frameshift mutation in the opened vector and two in the 
fragment on each side of the opening site gave A'to^-ts2% of 'active 
colonies depending on the location and combination. Some of these 
active colonies can be considered to be mosaics, not only related
to the opening site.

Two frameshift mutations in the opened vector on each side of the
opening site and one in the fragment gave 0.5 to 2.1% of active
colonies depending on the location and combination. Most of these
active colonies are mosaics of the "parent" DNA.

Two frameshift mutations in the opened vector on each side of the
opening site and a wild type fragment gave 7.7 to 10.7% of active
colonies depending on the location.

It was also found that the amount of vectors relative to fragments
and the size of the fragments also influence the result.

Using S. cerevisiae rad52 mutants as the recombination host cell
showed that the rad52 mutant transformed very well with wild type
plasmid(s) and expressed the Humicola lanuginosa lipase gene, but
gave no transformants at all with the opened vectors and fragments.

The RAD52 function is required for "classical recombination" (but
not for unequal sister-strand mitotic recombination), showing that
the recombination of opened vector and fragment could involve a
classical recombination mechanism.

Classical recombination is the recombination mechanism involved in
the recombination between genes located on nonsister chromatids of
homologous chromosomes as defined in for example Petes TD, Malone
RE and Symington LS (1991) "Recombination in Yeast", pages 407-522,
in The Molecular and Cellular Biology of the Yeast Saccharomyces,
Volume 1 (eds. Broach JR, Pringle JR and Jones EW), Cold Spring
Harbor Laboratory Press, New York.

Multiple partially overlapping fragments

The inventor also tested recombination of multiple partially
overlapping fragments using the method of the invention.

The recombination of 2 and 3 partially overlapping fragments into a
gapped vector (i.e. a vector in which the opening results in cutting
out a small part of the gene) was tested and gave a high recovery of
recombined Humicola lanuginosa lipase gene. The recovery of active
lipase gene from different combinations of inactivated Humicola
lanuginosa genes was tested for the recombination of 2 partially
overlapping fragments. The tendency was a higher mixing in the
overlapping region between the 2 fragments in the gapped region
than in the vector and fragment overlap.

When recombining many fragments from the same region, the multiple
overlapping fragment technique will increase the mixing by itself,
but it is also important to have a relatively high random mixing in
overlapping regions in order to mix closely located
variants/differences.

An overlap as small as 10 bp between two fragments was found to be
sufficient to obtain a very efficient recombination. Therefore,
overlapping in the range from 5 to 5000 bp, preferably from 10 bp to
500 bp, especially 10 bp to 100 bp is suitable according to the
method of the invention.

According to this embodiment of the present invention 2 or more
overlapping fragments, preferably 2 to 6 overlapping fragments,
especially 2 to 4 overlapping fragments may advantageously be used
as input fragments in a shuffling cycle.

Besides increasing the mixing of genes, this is a very useful
method for domain shuffling by creating small overlaps between DNA
fragments from different domains and screening for the best
combination.

For instance, in the case of three DNA fragments the overlapping
regions may be as follows:

- the first end of the first fragment overlaps the first end of the
opened plasmid,

- the first end of the second fragment overlaps the second end of 
the first fragment, and the second end of the second fragment 
overlaps the first end of the third fragment, 

- the first end of the third fragment overlaps (as stated above) the 
second end of the second fragment, and the second end of the third 
fragment overlaps the second end of the opened plasmid. 

It is to be understood that when using two or more DNA fragments as
starting material it is preferred to have continuous overlaps between
the ends of the plasmid and the DNA fragments.
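The chain of overlaps described above can be checked mechanically. The following is a minimal sketch, assuming each plasmid end and each fragment is described by hypothetical (start, end) coordinates on the gene and that 10 bp is taken as the minimum acceptable overlap; the function names and coordinates are illustrative only and are not part of the method itself.

```python
MIN_OVERLAP_BP = 10  # smallest overlap reported as sufficient in the text


def overlap_bp(a, b):
    """Number of base pairs shared by two half-open (start, end) intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def has_continuous_overlaps(first_plasmid_end, fragments, second_plasmid_end,
                            min_overlap=MIN_OVERLAP_BP):
    """True if every neighbouring pair in the chain shares >= min_overlap bp."""
    chain = [first_plasmid_end, *fragments, second_plasmid_end]
    return all(overlap_bp(x, y) >= min_overlap
               for x, y in zip(chain, chain[1:]))


# Hypothetical coordinates for an opened plasmid and three fragments.
plasmid_left = (0, 300)
plasmid_right = (850, 1200)
good = [(250, 500), (480, 700), (690, 900)]
broken = [(250, 500), (505, 700), (690, 900)]  # 5 bp gap after fragment 1

print(has_continuous_overlaps(plasmid_left, good, plasmid_right))
print(has_continuous_overlaps(plasmid_left, broken, plasmid_right))
```

The first call reports an unbroken chain, the second a chain interrupted by the gap between the first and second fragment.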

Even though it is preferred to shuffle homologous DNA sequences in
the form of DNA fragment(s) and opened plasmid(s), it is also
contemplated according to the invention to shuffle two or more
opened plasmids comprising homologous DNA sequences encoding
polypeptides. However, in such a case it is compulsory to open the
plasmids at different sites.

In a further embodiment of the invention two or more opened
plasmids and one or more homologous DNA fragments are used as the
starting material to be shuffled. The ratio between the opened
plasmid(s) and homologous DNA fragment(s) preferably lies in the
range from 20:1 to 1:50, preferably from 2:1 to 1:10 (mol vector:mol
fragments), with the specific concentrations being from 1 pM to 10 M
of the DNA.
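Because the ratio is stated in moles rather than mass, a small conversion helps when planning a transformation. The sketch below assumes the common approximation of about 650 g/mol per base pair of double-stranded DNA; the 6000 bp vector length and the helper names are hypothetical and chosen only for illustration.

```python
AVG_BP_MASS = 650.0  # g/mol per base pair of dsDNA (approximation, assumption)


def picomoles(mass_ng, length_bp):
    """Convert a mass of double-stranded DNA (ng) into picomoles."""
    return mass_ng * 1000.0 / (length_bp * AVG_BP_MASS)


def vector_to_fragment_ratio(vector_ng, vector_bp, fragment_ng, fragment_bp):
    """Molar ratio vector:fragment expressed as a single number."""
    return picomoles(vector_ng, vector_bp) / picomoles(fragment_ng, fragment_bp)


def in_range(ratio, upper, lower):
    """True if lower <= ratio <= upper (e.g. 20:1 down to 1:50)."""
    return lower <= ratio <= upper


# Hypothetical input: 100 ng of a 6000 bp vector, 500 ng of a 900 bp fragment.
ratio = vector_to_fragment_ratio(100, 6000, 500, 900)
print(round(ratio, 3))                     # about 0.03, i.e. roughly 1:33
print(in_range(ratio, 20.0, 1.0 / 50.0))   # broad range 20:1 to 1:50
print(in_range(ratio, 2.0, 1.0 / 10.0))    # narrower range 2:1 to 1:10
```

With these illustrative inputs the mixture falls inside the broad range but outside the narrower one.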



The opened plasmids may advantageously be gapped in such a way that
the overlap between the fragments is deleted in the vector in order
to select for the recombination.

Preparing the DNA fragment

The DNA fragment to be shuffled with the homologous polypeptide
comprised in an opened plasmid may be prepared by any suitable
method. For instance, the DNA fragment may be prepared by PCR
amplification (polymerase chain reaction), as described above, of a
plasmid or vector comprising the gene of the polypeptide, using
specific primers, for instance as described in US 4,683,202 or Saiki
et al., (1988), Science 239, 487 - 491. The DNA fragment may also be
cut out from a vector or plasmid comprising the desired DNA sequence
by digestion with restriction enzymes, followed by isolation using
e.g. electrophoresis.

The DNA fragment encoding the homologous polypeptide in question may
alternatively be prepared synthetically by established standard
methods, e.g. the phosphoramidite method described by Beaucage and
Caruthers, (1981), Tetrahedron Letters 22, 1859 - 1869, or the
method described by Matthes et al., (1984), EMBO Journal 3, 801 -
805. According to the phosphoramidite method, oligonucleotides are
synthesized, e.g. in an automatic DNA synthesizer, purified,
annealed, ligated and cloned in suitable vectors.

Furthermore, the DNA fragment may be of mixed synthetic and genomic,
mixed synthetic and cDNA or mixed genomic and cDNA origin prepared
by ligating fragments of synthetic, genomic or cDNA origin (as
appropriate), the fragments corresponding to various parts of the
entire DNA sequence, in accordance with standard techniques.

The plasmid 

The plasmid comprising the DNA sequence encoding the polypeptide in
question may be prepared by ligating said DNA sequence into a
suitable vector or plasmid, or by any other suitable method.

Said vector may be any vector which may conveniently be subjected to
recombinant DNA procedures. The choice of vector will often depend
on the recombination host cell into which it is to be introduced.

Thus, the vector may be an autonomously replicating vector, i.e. a 
vector which exists as an extrachromosomal entity, the replication 
of which is independent of chromosomal replication, e.g. a plasmid. 
Alternatively, the vector may be one which, when introduced into the
recombination host cell, is integrated into the host cell genome and
replicated together with the chromosome(s) into which it has been
integrated.

To facilitate the screening process it is preferred that the vector 
is an expression vector in which the DNA sequence encoding the 
polypeptide in question is operably linked to additional segments 
required for transcription of the DNA. In general, the expression 
vector is derived from a plasmid, a cosmid or a bacteriophage, or 
may contain elements of any or all of these. 

The term, "operably linked" indicates that the segments are arranged 
so that they function in concert for their intended purposes, e.g. 
transcription initiates in a promoter and proceeds through the DNA 
sequence coding for the polypeptide in question. 

The promoter may be any DNA sequence which shows transcriptional 
activity in the recombination host cell of choice and may be derived 
from genes encoding proteins, such as enzymes, either homologous or 
heterologous to the host cell. 



Examples of suitable promoters for use in yeast host cells include
promoters from yeast glycolytic genes (Hitzeman et al., (1980), J.
Biol. Chem. 255, 12073 - 12080; Alber and Kawasaki, (1982), J. Mol.
Appl. Gen. 1, 419 - 434) or alcohol dehydrogenase genes (Young et
al., in Genetic Engineering of Microorganisms for Chemicals
(Hollaender et al, eds.), Plenum Press, New York, 1982), or the TPI1
(US 4,599,311) or ADH2-4c (Russell et al., (1983), Nature 304, 652 -
654) promoters.

Examples of suitable promoters for use in filamentous fungus host
cells are, for instance, the ADH3 promoter (McKnight et al., (1985),
The EMBO J. 4, 2093 - 2099) or the tpiA promoter. Examples of other
useful promoters are those derived from the gene encoding A. oryzae
TAKA amylase, Rhizomucor miehei aspartic proteinase, A. niger
neutral α-amylase, A. niger acid stable α-amylase, A. niger or A.
awamori glucoamylase (gluA), Rhizomucor miehei lipase, A. oryzae
alkaline protease, A. oryzae triose phosphate isomerase or A.
nidulans acetamidase. Preferred are the TAKA-amylase and gluA
promoters.

The DNA sequence encoding the polypeptide in question may also, if
necessary, be operably connected to a suitable terminator, such as
the human growth hormone terminator (Palmiter et al., op. cit.) or
(for fungal hosts) the TPI1 (Alber and Kawasaki, op. cit.) or ADH3
(McKnight et al., op. cit.) terminators. The vector may further
comprise elements such as polyadenylation signals (e.g. from SV40
or the adenovirus 5 Elb region), transcriptional enhancer sequences
(e.g. the SV40 enhancer) and translational enhancer sequences (e.g.
the ones encoding adenovirus VA RNAs).

The vector may further comprise a DNA sequence enabling the vector
to replicate in the recombination host cell in question.
When the host cell is a yeast cell, suitable sequences enabling the
vector to replicate are the yeast plasmid 2μ replication genes REP
1-3 and origin of replication.

The plasmid pY1 can be used for production of useful proteins and
peptides, using filamentous fungi, such as Aspergillus sp., and
yeasts as recombinant host cells (JP06245777-A).

The vector may also comprise a selectable marker, e.g. a gene the
product of which complements a defect in the recombination host
cell, such as the gene coding for dihydrofolate reductase (DHFR) or
the Schizosaccharomyces pombe TPI gene (described by P.R. Russell,
(1985), Gene 40, 125-130).

Another example of such suitable selective markers are the ura3 and
leu2 genes, which complement the corresponding defect genes of e.g.
the yeast strain Saccharomyces cerevisiae YNG318.

The vector may also comprise a selectable marker which confers
resistance to a drug, e.g. ampicillin, kanamycin, tetracyclin,
chloramphenicol, neomycin, hygromycin or methotrexate. For
filamentous fungi, selectable markers include amdS, pyrG, argB, niaD,
sC, trpC, pyr4, and DHFR.

To direct the polypeptide in question into the secretory pathway of
the recombination host cell, a secretory signal sequence (also known
as a leader sequence, prepro sequence or pre sequence) may be
provided in the recombinant vector. The secretory signal sequence is
joined to the DNA sequence encoding the lipolytic enzyme in the
correct reading frame. Secretory signal sequences are commonly
positioned 5' to the DNA sequence encoding the polypeptide. The
secretory signal sequence may be the signal normally associated with
the polypeptide in question or may be from a gene encoding another
secreted protein.

The signal peptide may be a naturally occurring signal peptide or a
functional part thereof, or it may be a synthetic peptide. For
secretion from yeast cells, suitable signal peptides have been found
to be the α-factor signal peptide (cf. US 4,870,008), the signal
peptide of mouse salivary amylase (cf. O. Hagenbuchle et al.,
(1981), Nature 289, 643-646), a modified carboxypeptidase signal
peptide (cf. L.A. Valls et al., (1987), Cell 48, 887-897), the
Humicola lanuginosa lipase signal peptide, the yeast BAR1 signal
peptide (cf. WO 87/02670), or the yeast aspartic protease 3 (YAP3)
signal peptide (cf. M. Egel-Mitani et al., (1990), Yeast 6, 127).



For efficient secretion in yeast, a sequence encoding a leader
peptide may also be inserted downstream of the signal sequence and
upstream of the DNA sequence encoding the polypeptide in question.
The function of the leader peptide is to allow the expressed
polypeptide to be directed from the endoplasmic reticulum to the
Golgi apparatus and further to a secretory vesicle for secretion
into the culture medium (i.e. exportation of the polypeptide across
the cell wall or at least through the cellular membrane into the
periplasmic space of the yeast cell). The leader peptide may be the
yeast α-factor leader (the use of which is described in e.g. US
4,546,082, EP 16 201, EP 123 294, EP 123 544 and EP 163 529).
Alternatively, the leader peptide may be a synthetic leader peptide,
which is to say a leader peptide not found in nature. Synthetic
leader peptides may, for instance, be constructed as described in WO
89/02463 or WO 92/11378.

For use in filamentous fungi, the signal peptide may conveniently be
derived from a gene encoding an Aspergillus sp. amylase or
glucoamylase, a gene encoding a Rhizomucor miehei lipase or
protease, or a Humicola lanuginosa lipase. The signal peptide is
preferably derived from a gene encoding A. oryzae TAKA amylase, A.
niger neutral α-amylase, A. niger acid-stable amylase, or A. niger
glucoamylase.

The recombination host cell 

The recombination host cell, into which the mixture of
plasmid/fragment DNA sequences is to be introduced, may be any
eukaryotic cell, including fungal cells and plant cells, capable of
recombining the homologous DNA sequences in question.

According to prior art, prokaryotic microorganisms, such as bacteria
including Bacillus and E. coli; eukaryotic organisms, such as
filamentous fungi, including Aspergillus, and yeasts such as
Saccharomyces cerevisiae; and tissue culture cells from avian or
mammalian origins have been suggested for in vivo recombination. All
of said organisms can be used as recombination host cell, but in
general prokaryotic cells are not sufficiently effective (i.e. do
not result in a sufficient number of variants) to be suitable for
recombination methods for industrial use.

Consequently, preferred recombination host cells according to the
present invention are fungal cells, such as yeast cells or
filamentous fungi.

Examples of suitable yeast cells include cells of Saccharomyces sp.,
in particular strains of Saccharomyces cerevisiae or Saccharomyces
kluyveri, or Schizosaccharomyces sp. Methods for transforming yeast
cells with heterologous DNA and producing heterologous polypeptides
therefrom are described, e.g. in US 4,599,311, US 4,931,373, US
4,870,008, 5,037,743, and US 4,845,075, all of which are hereby
incorporated by reference. Transformed cells may be selected by,
e.g., a phenotype determined by a selectable marker, commonly drug
resistance or the ability to grow in the absence of a particular
nutrient, e.g. leucine. A preferred vector for use in yeast is the
POT1 vector disclosed in US 4,931,373. The DNA sequence encoding the
polypeptide may be preceded by a signal sequence and optionally a
leader sequence, e.g. as described above. Further examples of
suitable yeast cells are strains of Kluyveromyces, such as K.
lactis, Hansenula, e.g. H. polymorpha, or Pichia, e.g. P. pastoris
(cf. Gleeson et al., (1986), J. Gen. Microbiol. 132, 3459-3465; US
4,882,279).



Examples of other fungal cells are cells of filamentous fungi, e.g.
Aspergillus sp., Neurospora sp., Fusarium sp. or Trichoderma sp., in
particular strains of A. oryzae, A. nidulans or A. niger. The use of
Aspergillus sp. for the expression of proteins is described in,
e.g., EP 272 277, EP 230 023. The transformation of F. oxysporum
may, for instance, be carried out as described by Malardier et al.,
(1989), Gene 78, 147-156.



In an embodiment of the invention the recombination host cell is a
cell of the genus Saccharomyces, in particular S. cerevisiae.



METHODS AND MATERIALS

DNA sequence:

Humicola lanuginosa DSM 4109 derived lipase encoding DNA sequence.

Humicola lanuginosa lipase variants:

Variants used for preparing vectors to be opened in Example 2:

(a) E56R,D57L,I90F,D96L,E99K
(b) E56R,D57L,V60M,D62N,S83T,D96P,D102E
(c) D57G,N94K,D96L,L97M
(d) E87K,G91A,D96R,I100V,E129K,K237M,I252L,P256T,G263A,L264Q
(e) E56R,D57G,S58F,D62C,T64R,E87G,G91A,F95L,D96P,K98I,(K237M)
(f) E210K

Variants used for preparing DNA fragments by standard PCR
amplification in Example 2:

(g) S83T,N94K,D96N
(h) E87K,D96V
(i) N94K,D96A
(j) E87K,G91A,D96A
(k) D167G,E210V
(l) S83T,G91A,Q249R
(m) E87K,G91A
(n) S83T,E87K,G91A,N94K,D96N,D111N
(o) N73D,E87K,G91A,N94I,D96G
(p) L67P,I76V,S83T,E87N,I90N,G91A,D96A,K98R
(q) S83T,E87K,G91A,N92H,N94K,D96M
(s) S85P,E87K,G91A,D96L,L97V
(t) E87K,I90N,G91A,N94S,D96N,I100T
(u) I34V,S54P,F80L,S85T,D96G,R108W,G109V,D111G,S116P,L124S,
    V132M,V140Q,V141A,F142S,H145R,N162T,I166V,F181P,F183S,
    R205G,A243T,D254G,F262L
(v) E56R,D57L,I90F,D96L,E99K
(x) E56R,D57L,V60M,D62N,S83T,D96P,D102E
(y) D57G,N94K,D96L,L97M
(z) E87K,G91A,D96R,I100V,E129K,K237M,I252L,P256T,G263A,L264Q
(aa) E56R,D57G,S58F,D62C,T64R,E87G,G91A,F95L,D96P,K98I

Strains : 

Expression system host:

Saccharomyces cerevisiae YNG318: MATa Dpep4[cir+] ura3-52, leu2-D2,
his 4-539

Saccharomyces cerevisiae Rad52: Strain M1533 = MATa rad52 ura3,
obtained from Torsten Nilsson Tillgren, Institute of Genetics,
University of Copenhagen.



Plasmids : 

pJS026 (see figure 3) 
pJS037 (see figure 4) 
pYES 2.0 (Invitrogen) 

Transformation selective marker

ura3 

leu2 



Media 

SC-ura-: 90 ml 10 x Basal salt, 22.5 ml 20% casamino acids, 9 ml
tryptophan, H2O ad 806 ml, autoclaved, 3.6 ml 5% threonine and 90 ml
20% glucose or 20% galactose added.

LB-medium: 10 g Bacto-tryptone, 5 g Bacto yeast extract, 10 g NaCl
in 1 litre water.

Brilliant Green (BG) (Merck, art. No. 1.01310)

BG-reagent: 4 mg/ml Brilliant Green (BG) dissolved in water

Substrate 1:

10 ml olive oil (Sigma CAT NO. O-1500)

20 ml 2% polyvinyl alcohol (PVA)

The Substrate is homogenised for 15-20 minutes.

Methods:

Construction of yeast expression vector

The expression plasmids pJS026 and pJS037 are derived from pYES
2.0. The inducible GAL1-promoter of pYES 2.0 was replaced with the
constitutively expressed TPI (triose phosphate isomerase)-promoter
from Saccharomyces cerevisiae (Alber and Kawasaki, (1982), J. Mol.
Appl. Genet., 1, 419-434), and the ura3 promoter has been deleted. A
restriction map of pJS026 and pJS037 is shown in figure 3 and figure
4, respectively.

Preparation of the wild-type DNA fragment 

A lipase wild-type DNA fragment can be prepared either by PCR
amplification (resulting in low, medium or high mutagenesis) of the
pJS026 plasmid or by cutting the DNA fragment out by digesting with
a suitable restriction enzyme.

Fermentation of Humicola lanuginosa lipase variants in yeast

10 ml of SC-ura- medium is inoculated with a S. cerevisiae colony
and grown at 30°C for 2 days. The 10 ml is used for inoculating 300
ml SC-ura- medium which is grown at 30°C for 3 days. The 300 ml is
used for inoculation of 5 l of the following G-substrate:

400 g     Amicase
6.7 g     yeast extract (Difco)
12.5 g    L-Leucin (Fluka)
6.7 g     (NH4)2SO4
10 g      MgSO4·7H2O
17 g      K2SO4
10 ml     Trace compounds
5 ml      Vitamin solution
6.7 ml
25 ml     20% Pluronic (antifoam)

In a total volume of 5000 ml.



The yeast cells are fermented for 5 days at 30°C. They are given a
start dosage of 100 ml 70% glucose, and 400 ml 70% glucose is added
per day. A pH of 5.0 is kept by addition of a 10% NH3 solution.
Agitation is 300 rpm for the first 22 hours followed by 900 rpm for
the rest of the fermentation. Air is given at 1 l air/l/min for the
first 22 hours followed by 1.5 l air/l/min for the rest of the
fermentation.



Trace compounds:

6.8 g     ZnCl2
54.0 g    FeCl2·6H2O
19.1 g    MnCl2·4H2O
2.2 g     CuSO4·5H2O
2.58 g    CoCl2
0.62 g    H3BO3
0.024 g   (NH4)6Mo7O24·4H2O
0.2 g     KI
100 ml    HCl (concentrated)

In a total volume of 1 l.

Vitamin solution:

250 mg    Biotin
3 g       Thiamin
10 g      D-Calciumpanthetonat
100 g     Myo-Inositol
50 g      Cholinchlorid
1.6 g     Pyridoxin
1.2 g     Niacinamid
0.4 g     Folicacid
0.4 g     Riboflavin

In a total volume of 1 l.



Transformation of yeast 

Saccharomyces cerevisiae is transformed by standard methods (cf.
Sambrook et al., (1989), Molecular Cloning: A Laboratory Manual,
2nd Ed., Cold Spring Harbor).

Determination of yeast transformation frequency 

The transformation frequency is determined by cultivating the
transformants on SC-ura- plates for 3 days and counting the number of
colonies appearing. The number of transformants per μg opened
plasmid is the transformation frequency.
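As a minimal sketch of this calculation, the helper below divides a colony count by the mass of opened plasmid used. The function name and the example numbers are hypothetical, and the result is expressed in whatever mass unit is supplied.

```python
def transformation_frequency(colonies, opened_plasmid_mass):
    """Transformants per unit mass of opened plasmid."""
    if opened_plasmid_mass <= 0:
        raise ValueError("plasmid mass must be positive")
    return colonies / opened_plasmid_mass


# Hypothetical plate count: 25000 colonies obtained from 0.5 mass units.
frequency = transformation_frequency(25000, 0.5)
print(frequency)
```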

Screening for positive variants with improved wash performance

The following filter assay can be used for screening positive
variants with improved wash performance.

Low calcium filter assay

1) Provide SC Ura- replica plates (useful for selecting strains
carrying the expression vector) with a first protein binding filter
(Nylon membrane) and a second low protein binding filter (Cellulose
acetate) on the top.

2) Spread yeast cells containing a parent lipase gene or a mutated
lipase gene on the double filter and incubate for 2 or 3 days at
30°C.

3) Keep the colonies on the top filter by transferring the top- 
filter to a new plate. 

4) Remove the protein binding filter to an empty petri dish. 

5) Pour an agarose solution comprising an olive oil emulsion (2%
PVA:olive oil=3:1), Brilliant green (indicator, 0.004%), 100 mM tris
buffer pH 9 and EGTA (final concentration 5 mM) on the bottom filter
so as to identify colonies expressing lipase activity in the form of
blue-green spots.

6) Identify colonies found in step 5) having a reduced dependency 
for calcium as compared to the parent lipase. 

DNA sequencing was performed by using an Applied Biosystems ABI DNA
sequencer model 373A according to the protocol in the ABI Dye
Terminator Cycle Sequencing kit.

Assessing the efficiency of recombination

The number of colonies determines the efficiency of the opened
vector and fragment recombination. The percentage of colonies with
lipase activity gives an estimate of the mixing of the active and
inactive genes. Theoretically, assuming equal likelihood of wild
type and frameshift at each mutated position, the best mixing
corresponds to 50% active colonies for one frameshift, 25% for 2
frameshifts and 12.5% for 3 frameshifts.
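The theoretical percentages quoted above follow from halving the active fraction for every additional frameshift. A minimal sketch, assuming each mutated position is independently inherited as wild type or frameshift with equal likelihood (the function names are illustrative, not from the source):

```python
def expected_active_fraction(n_frameshifts):
    """Fraction of recombinants expected to be free of all frameshifts."""
    return 0.5 ** n_frameshifts


def percent_active(positives, colonies):
    """Observed percentage of colonies showing lipase activity."""
    return 100.0 * positives / colonies


for n in (1, 2, 3):
    print(n, 100 * expected_active_fraction(n))
```

Comparing an observed `percent_active` value with the matching expected value gives a rough indication of how close an experiment comes to ideal mixing.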

Frameshift mutation 

The frameshift mutations were created either by filling in a
restriction site (in case of 5' overhang) or deleting the "sticky
ends" (in case of 3' overhang) by T4 DNA polymerase with or without
dNTP (deoxynucleotides = equal amounts of dATP, dTTP, dCTP and
dGTP). Methods for filling in of restriction sites (referred to as
"F" on Figure 7) and deleting the sticky ends (referred to as
"(D)" on Figure 7) are well known in the art.

Method for assessing colonies with lipase activity

The number of colonies and positives (i.e. with lipase activity) are 
calculated as the average of 3 plates. 

The cultivation condition and screening condition used is the 
following : 

1) Provide SC Ura- plates with a protein binding filter (Nylon
filter) on the plate.

2) Spread yeast cells containing a parent lipase gene or a mutated
lipase gene on the filter and incubate for 3 or 4 days at 30°C.

3) Remove the protein binding filter with the colonies to a petri
dish containing: An agarose solution comprising an olive oil
emulsion (2% PVA:Olive oil=2:1), Brilliant green (indicator, 0.004%),
100 mM tris buffer pH 9.

4) Identify colonies expressing lipase activity in the form of blue-
green spots.

EXAMPLES 

Example 1 

Testing in vivo recombination of two homologous genes 

The Saccharomyces cerevisiae expression plasmid pJS026 was
constructed as described above in the "Materials and Methods"
section.

A synthetic Humicola lanuginosa lipase gene (in pJS037) containing
12 additional restriction sites (see figure 4) was cut with NruI,
PstI, and NruI and PstI, respectively, to open the gene
approximately in the middle of the DNA sequence encoding the lipase.

The opened plasmid (pJS037) was transformed into Saccharomyces
cerevisiae YNG318 together with an about 0.9 kb wild-type Humicola
lanuginosa lipase DNA fragment (see figure 1) prepared from pJS026
by PCR amplification.

Further, the opened plasmid was also transformed into the yeast 
recombination host cell alone (i.e. without the 0.9 kb synthetic 
lipase DNA fragment) . 

The transformed yeast cells were grown as described in the
"Materials and Methods" section above, and the transformation
frequency was determined as described above.

It was found that the transformation frequency of the opened plasmid
alone was very low (10 transformants per μg opened plasmid), in
comparison to the transformation frequency of said plasmid/fragment
(50,000 transformants per μg opened plasmid).

The plasmid/fragment transformants were PCR amplified, resulting in
fragments covering the lipase gene region of the recombined
plasmid/fragments. The recombination mixture of the 20
transformants was analyzed by restriction site digestion using
standard methods. The result is displayed in Table 1.

Table 1

NruI (not tested)

PCR
fragment  SphI  HindIII  PstI  BstXI  NheI  BstEII  KpnI  XhoI

P1        wt    wt       wt    wt     wt    wt      wt    wt
P2        sg    sg       sg    wt     wt    wt      wt    wt
P3        sg    sg       sg    sg     nd    sg      sg    nd
P4        nd    sg       sg    wt     nd    wt      nd    nd
P5        wt    wt       nd    wt     wt    wt      wt    wt
P6        sg    sg       sg    sg     sg    sg      sg    nd
N1        wt    wt       wt    wt     sg    wt      wt    wt
N2        wt    wt       wt    wt     wt    wt      wt    wt
N3        wt    wt       wt    wt     wt    wt      wt    wt
N4        sg    sg       sg    wt     wt    wt      wt    wt
N5        sg    sg       sg    wt     wt    wt      wt    wt
N6        wt    wt       wt    sg     sg    sg      sg    sg
P/N1      sg    sg       sg    wt     wt    wt      wt    wt
P/N2      sg    sg       sg    sg     sg    sg      sg    nd
P/N3      sg    sg       sg    wt     nd    sg      sg    sg
P/N4      sg    sg       sg    sg     sg    sg      sg    nd
P/N5      sg    sg       sg    sg     sg    sg      sg    nd
P/N6      sg    sg       sg    wt     nd    sg      sg    sg
P/N7      nd    wt       wt    wt     nd    wt      nd    wt
P/N8      sg    sg       sg    wt     wt    wt      sg    nd

P: plasmid opened with PstI
N: plasmid opened with NruI

P/N: plasmid opened with PstI and NruI (resulting in the removal of
a 75 bp fragment)

wt: wild-type gene restriction enzyme pattern
sg: synthetic gene restriction enzyme pattern
nd: not determined



As can be seen from Table 1, 10 transformants (equivalent to 50%)
contained recombined DNA sequences. 4 of these 10 DNA sequences
(equivalent to 20%) contained either a region of the wild-type gene
recombined into the synthetic gene or a region of the synthetic gene
recombined into the wild-type fragment.
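The two counts can be reproduced from the restriction patterns in Table 1. The sketch below is one possible way to do so, assuming that "nd" entries are simply skipped, that a transformant counts as recombined when its pattern switches between wt and sg at least once, and that two or more switches indicate an internal region of one gene recombined into the other. These classification rules are an interpretation for illustration, not a procedure stated in the text.

```python
# Restriction patterns per transformant, in column order
# SphI, HindIII, PstI, BstXI, NheI, BstEII, KpnI, XhoI.
TABLE_1 = {
    "P1": "wt wt wt wt wt wt wt wt", "P2": "sg sg sg wt wt wt wt wt",
    "P3": "sg sg sg sg nd sg sg nd", "P4": "nd sg sg wt nd wt nd nd",
    "P5": "wt wt nd wt wt wt wt wt", "P6": "sg sg sg sg sg sg sg nd",
    "N1": "wt wt wt wt sg wt wt wt", "N2": "wt wt wt wt wt wt wt wt",
    "N3": "wt wt wt wt wt wt wt wt", "N4": "sg sg sg wt wt wt wt wt",
    "N5": "sg sg sg wt wt wt wt wt", "N6": "wt wt wt sg sg sg sg sg",
    "P/N1": "sg sg sg wt wt wt wt wt", "P/N2": "sg sg sg sg sg sg sg nd",
    "P/N3": "sg sg sg wt nd sg sg sg", "P/N4": "sg sg sg sg sg sg sg nd",
    "P/N5": "sg sg sg sg sg sg sg nd", "P/N6": "sg sg sg wt nd sg sg sg",
    "P/N7": "nd wt wt wt nd wt nd wt", "P/N8": "sg sg sg wt wt wt sg nd",
}


def switches(pattern):
    """Count wt/sg changes along the gene, ignoring undetermined sites."""
    calls = [c for c in pattern.split() if c != "nd"]
    return sum(1 for a, b in zip(calls, calls[1:]) if a != b)


recombined = [name for name, p in TABLE_1.items() if switches(p) >= 1]
internal = [name for name, p in TABLE_1.items() if switches(p) >= 2]

print(len(recombined), "of", len(TABLE_1), "recombined:", recombined)
print(len(internal), "with an internal region exchanged:", internal)
```

Under these assumptions the script finds 10 recombined transformants out of 20, of which 4 (N1, P/N3, P/N6 and P/N8) show an internal exchange, in agreement with the figures given above.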

Example 2 

In vivo recombination of Humicola lanuginosa lipase variants 

The DNA sequences of 20 variants of the Humicola lanuginosa lipase
were in vivo recombined in the same mixture.

Six vectors were prepared from the lipase variants (a) to (f) (see
the list above) by ligation into the yeast expression vector pJS037.
All vectors were cut open with NruI.

DNA fragments of all 20 homologous DNA sequences (g) to (aa) (see the
list above) were prepared by PCR amplification using standard
methods.

The 20 DNA fragments and the 6 opened vectors were mixed and
transformed into the yeast Saccharomyces cerevisiae YNG318 by
standard methods. The recombination host cell was cultivated as
described above and screened as described above. About 20
transformants were isolated and tested for improved wash performance
using the filter assay method described in the "Materials and
Methods" section.

Two positive transformants (named A and B) were identified using the
filter assay.

In comparison to the wild-type amino acid sequence the two
recombined positive transformants had the following mutations.

A: D57G, N94K, D96L, P255T 

A is a recombination of two variants.
originates from the vector (d) 

===== originates from the DNA fragment prepared from variant (y) 

B: D57G, G59V, N94K, D96L, L97M, S116P, S170P, N249R 
???? <<<<< ????? ===== 

B is a recombination of vector (c), DNA fragments (n) and (u).
originates from the vector (c)

<<<< originates from the DNA fragment prepared from variant (u) 
===== originates from the DNA fragment prepared from variant (n) 
???? Amino acid mutation which is not a result of recombination. 

As can be seen, the resulting positive variants have been formed by
recombination of two or more variants. The amino acid mutations
marked "????" are not a result of in vivo recombination, as none of
the shuffled lipase variants (see the list above) comprise any of
said mutations. Consequently, these mutations are a result of random
mutagenesis arising during preparation of the DNA fragments by
standard PCR amplification.

Example 3 

Recombination with one frameshift mutation

Synthetic Humicola lanuginosa lipase gene (in vector pJS037) was
made inactive at various positions by deleting (positions 184/385)
or filling-in (position 290/317/518/746) restriction enzyme sites
or by site-directed introduction of a stop codon. All inactive
synthetic lipase genes of 900 bp can be deduced from Figure 7.
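Both deleting and filling-in inactivate the gene because they change its length by a number of base pairs that is not a multiple of three. A minimal sketch of that check follows; the 4-nucleotide overhang used in the examples is an assumption for illustration, since the overhang lengths of the sites actually used are not stated here.

```python
def shifts_reading_frame(length_change_bp):
    """True if an insertion (+) or deletion (-) of this size breaks the frame."""
    return length_change_bp % 3 != 0


# Assumed illustration: a 4-nt 5' overhang filled in adds 4 bp,
# a 4-nt 3' overhang removed deletes 4 bp.
fill_in_change = +4
deletion_change = -4

print(shifts_reading_frame(fill_in_change))
print(shifts_reading_frame(deletion_change))
print(shifts_reading_frame(3))  # an in-frame change of one codon
```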

A number of different 900 bp DNA fragments were made from the above
vectors using primer 4599 and primer 5164 using standard PCR
technique. Smaller PCR fragments were made using primer 8487 and
primer 4548 (260 bp), primer 2843 and primer 4548 (488 bp).

0.5 μl (app. 0.1 μg) of vectors Blue 425, Blue 426, Blue 428 and
Blue 429, opened with PstI (i.e. position 335), and vectors Blue 424
and Blue 425 opened with NruI (i.e. position 464) were together
with 3 μl (app. 0.5 μg) of fragments 424, 425, 426, 428, 429 in
various combinations transformed into 100 μl Saccharomyces cerevisiae
YNG318 competent cells as displayed in Table 1A.

The number of colonies and positives (i.e. with lipase activity) 
were calculated as the average of 3 plates as described in the 
Material and Methods section. 

The result of the test is shown in Table 1A.

Table 1A

Vector + Fragment              Number of   % of colonies with
                               colonies    active lipase activity

1. Blue 428 + 429 ¤            114         16%
2. Blue 429 + 428 #            645         3%
3. Blue 426 + 425 #            91          [illegible]
4. Blue 425 + 426              528         18%
5. Blue 425/NruI + 426         539         28%
6. Blue 425 + 424              139         7%
7. Blue 424/NruI + 425 ¤       74          32%
8. Blue 428 + 425              81          12%
9. Blue 428 + wt fragment      317         37%

Pairwise recombinations of one frameshift mutation on the vector
and another on the fragment on the opposite side of the opening
site. ¤ determined by 9 plates; # determined by 6 plates.

The first 2 rows of Table 1A display vectors and fragments with a
frameshift on each side of the PstI site. The "mirror image"
experiment in row 2 gives a reproducibly lower number of active
colonies than row 1. The same is true for rows 3 and 4, even
though it is not as pronounced. Moving the opening site closer to
the frameshift in the vector increases the number of actives, as
seen in row 5. This can explain the difference in the "mirror
image" experiments. In both cases the experiment with the higher
number of positives has the opening site closer to the frameshift
in the vector.



It can therefore be concluded that the closer the mutation is to
the end of the vector, the higher the chance of mixing. This
probably arises from the well known fact that free DNA ends have a
high recombinogenic potential. Therefore it is desirable to have as
many free DNA ends as possible to increase the mixing of the genes.
This is for example obtained in the later example with
recombination of multiple overlapping fragments.



Row 6 has a rather low number of actives, probably due to the
location of the frameshift on the fragment exactly at the PstI
opening site of the vector.

Row 7 has the frameshift of the vector close to the opening site
and again it gives a high number of actives.






Recombination with one stop codon mutation

In order to test whether there is any difference in the
recombination efficiency of stop codon mutations compared to
frameshift mutations the following experiments were made.

In the same way as described above, 0.5 μl (app. 0.1 μg) of vectors
Blue 624, Blue 625 and Blue 626 (see Table 1B) opened with PstI,
comprising stop codons at specified positions (positions 164, 317
and 746, respectively) (prepared by site-directed mutagenesis), were
together with 3 μl (app. 0.5 μg) of fragments 624, 625 and 626
transformed into 100 μl Saccharomyces cerevisiae YNG318 competent
cells in various combinations as displayed in Table 1B.



Table 1B

Vector + Fragment      Number of   % of colonies with
                       colonies    lipase activity

1. Blue 626 + 624      ND          40%
2. Blue 624 + 626      ND          12%
3. Blue 625 + 624      ND          75%
4. Blue 624 + 625      ND          10%

Pairwise recombinations of one stop codon mutation on the vector
and another on the fragment on the opposite side of the opening
site. ND = not determined but a high number.

Rows 1 and 2 (in Table 1B) have the mutations located at the same
places as rows 1 and 2 in Table 1A. As can be seen, the number of
colonies with lipase activity is clearly higher for the stop codon
mutations than for the frameshift mutations, but the relative
difference between the "mirror image" experiments is the same.

This might indicate that stop codon mutations, which are closer to
the "application" of the method, give better mixing than
frameshift mutations. Rows 3 and 4 confirm that the closer the
mutation is to the end of the vector, the higher the chance of mixing.
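
The "mirror image" comparison above can be made explicit with a short calculation. The following is only an illustrative sketch: the percentages are copied from Table 1B, while the function name mirror_fold and the data layout are assumptions made for this example and are not part of the described method.

```python
# Percent of colonies with lipase activity, copied from Table 1B.
# Keys are (opened vector, fragment).
percent_active = {
    ("Blue 626", "624"): 40,
    ("Blue 624", "626"): 12,
    ("Blue 625", "624"): 75,
    ("Blue 624", "625"): 10,
}


def mirror_fold(vector_mutant: str, fragment_mutant: str) -> float:
    """Fold difference between an experiment and its mirror image.

    The mirror image swaps which mutation sits on the vector and
    which sits on the fragment.
    """
    forward = percent_active[(f"Blue {vector_mutant}", fragment_mutant)]
    mirror = percent_active[(f"Blue {fragment_mutant}", vector_mutant)]
    return forward / mirror


# Row 1 versus row 2, and row 3 versus row 4.
print(round(mirror_fold("626", "624"), 2))
print(round(mirror_fold("625", "624"), 2))
```

Under these figures the experiment with the mutation nearer the vector end gives roughly 3-fold (rows 1 and 2) and 7.5-fold (rows 3 and 4) more active colonies than its mirror image.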



Recombination with one or two frameshift mutations in the vector
and one or two frameshift mutations in the fragment

Using the same approach as described above, the influence of one or
two frameshift mutations in the vector and one or two frameshift
mutations in the fragment was tested using vectors Blue 425, 426
and 428 (one mutation) and vectors Blue 442 and Blue 443 (two
mutations), together with fragments 442 and 443 (two frameshift
mutations), fragments 424, 425, 426, 427 and 428 (one mutation) and
wild-type fragment (no mutation).



The vectors Blue 442 and Blue 443 carry double frameshift mutations:
Blue 442 = 428 + 429 and Blue 443 = 427 + 429 (see Figure 7).



Recombination was performed by transforming 0.5 µl vector (app. 0.1
µg) opened with PstI and 3 µl PCR fragment (app. 0.5 µg) into 100
µl Saccharomyces cerevisiae YNG318 competent cells.

The result of the test is shown in Table 2A and Table 2B.



Table 2A

Vector + Fragment       Number of colonies    % of colonies with active Lipolase
1. Blue 425 + 442       142                   15%
2. Blue 425 + 443       144                   14%
3. Blue 426 + 442       42                    42%
4. Blue 426 + 443#      77                    20%
5. Blue 428 + 443       115                   3.8%

One frameshift mutation on the vector and two on the fragment on
each side of the opening site. # determined by 6 plates.

Table 2B

Vector + Fragment           Number of colonies    % of colonies with active Lipolase
Blue 442 + 424              137                   0.5%
Blue 442 + 426              118                   1.1%
Blue 442 + 427#             125                   1.3%
Blue 443 + 425              540                   2.5%
Blue 443 + 426              196                   1.5%
Blue 443 + 428              469                   3.1%
Blue 442 + wt fragment      135                   7.7%
Blue 443 + wt fragment      488                   10.7%

Two frameshift mutations on the vector on each side of the opening
site and one on the fragment. # determined by 6 plates.

Table 2A shows a rather high number of colonies with lipase
activity even with a total of 3 frameshifts (but only one
frameshift on the vector), except for the last row, where the
frameshift on the vector is located far from the opening site. Row
4 has fewer actives than row 3, probably because the frameshift on
the vector is located further away from the opening site than the
frameshift on the fragment, making the active genes mosaics that
are not related to the opening site (see Figure 2A). In Table 2B a
very low number of actives is observed when there are 2
frameshifts located on the vector. Most of these active colonies
are mosaics of the "parent" DNA, meaning that the mixing is not
related to the opening site (see Figure 2B).
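
To relate the percentages in Table 2A to absolute numbers, the approximate count of active colonies can be estimated as colonies x percentage / 100. The sketch below does only this arithmetic; the figures are taken from Table 2A, and the helper name estimated_actives is an assumption introduced for illustration. The results are estimates, since the source reports rounded percentages rather than raw counts of actives.

```python
# (vector + fragment, number of colonies, % active) copied from Table 2A.
table_2a = [
    ("Blue 425 + 442", 142, 15.0),
    ("Blue 425 + 443", 144, 14.0),
    ("Blue 426 + 442", 42, 42.0),
    ("Blue 426 + 443", 77, 20.0),
    ("Blue 428 + 443", 115, 3.8),
]


def estimated_actives(colonies: int, percent_active: float) -> float:
    """Approximate number of active colonies implied by a percentage."""
    return colonies * percent_active / 100


for label, colonies, percent in table_2a:
    print(f"{label}: about {estimated_actives(colonies, percent):.0f} active colonies")
```

On these figures the last row, with the vector frameshift far from the opening site, corresponds to only about 4 active colonies out of 115.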

Recombination with two different vectors or fragments



The result of the test of recombination with two different vectors
or fragments is shown in Table 3.



Table 3

Vector + Fragment                          Number of colonies    % of colonies with active Lipolase
Blue 428/PstI + Blue 429/PstI#             13                    15%
Blue 428/PstI + Blue 429/PstI + 442        273                   4.2%
Blue 442/PstI + 428 + 429                  228                   0.8%
Blue 443/PstI + 427 + 428                  229                   1.6%

Recombination with two different vectors or fragments. # Determined
by 6 plates.



A low number of colonies is seen for the control experiment in row
1 of Table 3, as expected. The fragment added in the second row has
two frameshifts, each corresponding to the frameshift on one of the
vectors. Via a tripartite recombination 4.2% actives are created.
With two fragments, each with one frameshift, and a vector with the
same two frameshifts, very few actives are found.

Recombination with vectors opened at different sites 

Opening the vector at one side instead of approximately in the
middle still gives good recombination, as shown in Table 4. Two
vectors opened at different sites can also recombine to some extent
(compare with the vector controls in Table 13).



Table 4

Vector + Fragment                    Number of colonies    % of colonies with active Lipolase
Blue 428/XhoI + 429                  160                   11%
Blue 428/XhoI + Blue 429/PstI#       35                    6.3%

Opening of the vector at one side instead of in the middle. #
determined by 6 plates.



Recombination at different concentrations of vector and fragment 

The relative concentration of vector to fragment does influence the
percentage of positive colonies, as can be seen in Table 5.



Table 5

Vector + Fragment                Number of colonies    % of colonies with lipase activity
0.5 µl Blue 426 + 3 µl 442       42                    42%
1.5 µl Blue 426 + 3 µl 442       21                    51%
1.5 µl Blue 426 + 9 µl 442       34                    26%
1.5 µl Blue 426 + 3 µl 427       230                   2.8%
1 µl Blue 442 + 1 µl 425         224                   1.16%
1 µl Blue 442 + 2 µl 425         429                   0.9%
1 µl Blue 442 + 4 µl 425         434                   1.6%
1 µl Blue 442 + 8 µl 425         481                   1.6%
1 µl Blue 442 + 16 µl 425        497                   2.0%

Varying the concentration of the vector or fragment.
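
The dependence on the vector-to-fragment proportion can be inspected by computing the fragment:vector volume ratio for each row of Table 5. This is a minimal sketch under stated assumptions: the volumes (in µl), colony numbers and percentages are copied from Table 5, whereas the tuple layout and the function name volume_ratio are illustrative choices. Volume ratios are not molar ratios, because the DNA concentrations of the vector and fragment preparations differ.

```python
# (vector, vector volume in µl, fragment, fragment volume in µl,
#  number of colonies, % of colonies with lipase activity), from Table 5.
table_5 = [
    ("Blue 426", 0.5, "442", 3, 42, 42.0),
    ("Blue 426", 1.5, "442", 3, 21, 51.0),
    ("Blue 426", 1.5, "442", 9, 34, 26.0),
    ("Blue 426", 1.5, "427", 3, 230, 2.8),
    ("Blue 442", 1, "425", 1, 224, 1.16),
    ("Blue 442", 1, "425", 2, 429, 0.9),
    ("Blue 442", 1, "425", 4, 434, 1.6),
    ("Blue 442", 1, "425", 8, 481, 1.6),
    ("Blue 442", 1, "425", 16, 497, 2.0),
]


def volume_ratio(row) -> float:
    """Fragment volume divided by vector volume for one table row."""
    _, vector_ul, _, fragment_ul, _, _ = row
    return fragment_ul / vector_ul


for row in table_5:
    vector, _, fragment, _, colonies, percent = row
    print(f"{vector} + {fragment}: fragment/vector = {volume_ratio(row):g}, "
          f"{colonies} colonies, {percent}% active")
```

Listed this way, the Blue 442 + 425 series shows the fragment volume rising 16-fold while the percentage of active colonies stays between roughly 1% and 2%.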



Recombination with fragments of different size 

The size of the fragment also influences the recombination result 
as seen in Table 6. 



Table 6

Vector + Fragment             Number of colonies    % of colonies with active Lipolase
Blue 424 + 425 (260 bp)       73                    34%
Blue 424 + 425 (489 bp)       130                   45%
Blue 424 + 424 (480 bp)       133                   0.3%
Blue 424 + 428 (480 bp)       130                   36%
Blue 428 + 425 (480 bp)       150                   28%
Blue 425 + 424 (480 bp)       69                    0%
Blue 425 + 428 (480 bp)       63                    55%

Recombination with fragments smaller than 900 bp.



Recombination with unopened vectors 

Transformation with unopened vectors shows a very low degree of 
recombination (Table 7). 



Table 7

Plasmid                    Number of colonies    % of colonies with active Lipolase
Blue 428 + Blue 429        887                   0.3%
Blue 426 + Blue 425        697                   0.7%

Recombination of unopened plasmids.



Example 4 

Test of S. cerevisiae mutants altered in recombination

Using the same approach as described in Example 3, recombination of
opened and unopened vectors and fragments was tested using a
Saccharomyces cerevisiae rad52 mutant as the recombination host
cell. The result is displayed in Table 8.

Table 8

Vector + Fragment       Number of colonies    % of colonies with active Lipolase
Blue 428 + 429          0                     0
Blue 442 + 427          0                     0
Blue 424 + 425          0                     0
Blue 426 + 443          0                     0
Plasmid pJSO37          544                   100%

Recombination result in rad52 mutant.



The result with rad52 showed that recombination was completely
abolished. The RAD52 function is required for classical
recombination (but not for unequal sister-strand mitotic
recombination), showing that the recombination of opened vector and
fragment could involve a classical recombination mechanism.

Example 6 

Recombination of multiple partially overlapping fragments

In order to increase the mixing of the mutations by the
recombination method of the invention, recombination of two
fragments and one gapped vector was attempted.



Table 15

Vector + Fragment                                  Number of colonies    % of colonies with lipase activity
1.  pJSO37/HindIII-XhoI + PCR319 + PCR327          > 2000                100%
2.  pJSO37/HindIII-XhoI + PCR321 + PCR331          2000                  ≈ 0.2%
3.  pJSO37/HindIII-XhoI + PCR319 + PCR331          ≈ 1500                ≈ 1%
4.  pJSO37/HindIII-XhoI + PCR319 + PCR386          > 5000                > 90%
5.  pJSO37/HindIII-XhoI + PCR321 + PCR386          > 5000                ≈ 25%
6.  Blue 428/HindIII-XhoI + PCR321 + PCR331        400                   0.2%
7.  Blue 428/HindIII-XhoI + PCR319 + PCR327        1500                  > 90%
8.  Blue 428/HindIII-XhoI + PCR321 + PCR327        150                   10%
9.  Blue 428/HindIII-XhoI + PCR327 + PCR385        1500                  10%
10. Blue 429/HindIII-XhoI + PCR319 + PCR386        ≈ 400                 ≈ 15%
11. Blue 429/HindIII-XhoI + PCR321 + PCR386        ≈ 350                 ≈ 15%
12. Blue 442/HindIII-XhoI + PCR319 + PCR327        ≈ 1500                ≈ 10%
13. Blue 428/HindIII-XhoI                          2                     0
14. Blue 429/HindIII-XhoI                          0                     0
15. Blue 442/HindIII-XhoI                          6                     0
16. Blue 428/HindIII-XhoI + PCR331                 4                     0
17. Blue 428/HindIII-XhoI + PCR321                 2                     0

Recombination result of two fragments and a gapped vector. The last
5 rows are controls.

As can be seen in Table 15, the recovery of the Humicola lanuginosa
lipase gene is very efficient. The last 5 rows in Table 15 show
that the opened vector alone, or with only one fragment not covering
the whole gap (see Figure 3), gives only very few colonies.

The first row, with wild-type fragments, gives 100% active
colonies.

The second row is with two fragments, each containing a frameshift.
Fragment PCR331 has the frameshift located at the BglII site which,
in this recombination, is not covered by a wild-type fragment (see
Figure 3), and it therefore gives about 0% active lipase. The same
is the case for rows 3 and 6.

In row 4, fragment PCR386 contains a frameshift at the SphI site,
which is overlapped by wild-type sequence in the gapped vector. The
frameshift was recombined into less than 10% of the genes, which is
lower than the result for one-fragment recombination in the last
row of Table 1A above.

In row 5 a rather high mixing is observed between the 2 fragments,
each containing a frameshift, and the wild-type gapped vector,
giving 25% active and 75% inactive lipase colonies. This is
probably because fragment PCR321 has its frameshift in the overlap
between the 2 fragments and in the gapped region of the vector. If
fragment PCR386 contributes 10% inactives as in row 4, fragment
PCR321 gives the remaining 65% inactives; PCR386 therefore gives
35% wt in the overlap.
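
The bookkeeping behind the row 5 estimate can be written out step by step. The sketch below only restates the arithmetic of the preceding paragraph; the variable names are illustrative, and the 10% contribution of PCR386 is the assumption made in the text (taken from row 4, where it is reported as an upper bound), not an independent measurement.

```python
# Row 5 of Table 15: about 25% of the colonies have lipase activity.
active_row5 = 25
inactive_row5 = 100 - active_row5            # 75% inactive colonies

# Assumption from the text: PCR386 contributes 10% inactives, as in row 4.
inactive_from_pcr386 = 10

# The remaining inactives are attributed to the frameshift on PCR321.
inactive_from_pcr321 = inactive_row5 - inactive_from_pcr386

# Share of genes that received wild-type sequence in the overlap,
# i.e. that did not take up the PCR321 frameshift.
wt_in_overlap = 100 - inactive_from_pcr321

print(inactive_row5, inactive_from_pcr321, wt_in_overlap)
```

With these inputs the three printed values are 75, 65 and 35, matching the percentages quoted for row 5.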

Row 7 is the "mirror image" of row 4, with the frameshift at the
SphI site on the vector (see Figure 7) and 2 wild-type fragments,
giving integration of the wild-type fragment into more than 90% of
the vectors.

Row 6 shows, as in row 5, that the frameshift of PCR321 in the
overlap and gap region gives a very high number of inactives.

In row 9, fragment PCR385, with a frameshift in the vector overlap,
causes a very high number of inactives.

Row 10 gives a rather high number of inactives compared to rows 7
and 4. It is not increased in row 11.

Row 12 shows that two frameshifts on the vector give a lower
number of actives compared to one in row 7.

The recombination of 3 partially overlapping fragments into a gapped
vector is also very efficient, as seen in Table 16. The last row,
with the vector alone, gives very few colonies. As can be seen in
Figure 4, all fragments used are wt. In the first row of Table 16
there are rather long overlaps between the vector and fragments,
but in the middle row the overlap between PCR353 and PCR355 is only
10 bp long, and it is still very efficiently recombined. This
surprising result may be utilized for very easy domain shuffling of
even distantly related genes. For example, 3 different domains
from 10 different genes can be made as PCR fragments, designed by
primer design to have a 10 to 20 bp overlap, recombined together
and subsequently screened for the best combination (1000 possible
combinations).
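
The figure of 1000 possible combinations follows from choosing, for each of the 3 domain positions, one of the 10 parent genes (10 x 10 x 10). A minimal sketch of this enumeration is given below; the gene labels and variable names are hypothetical placeholders, not designations used in the examples.

```python
from itertools import product

# Hypothetical labels for 10 parent genes, each divided into 3 domains.
genes = [f"gene{i}" for i in range(1, 11)]
n_domains = 3

# A chimera is described by which parent gene supplies each domain position.
chimeras = list(product(genes, repeat=n_domains))

print(len(chimeras))    # total number of domain combinations
print(chimeras[0])      # all three domains taken from the first gene
print(chimeras[-1])     # all three domains taken from the last gene
```

Of these combinations, 10 are the unshuffled parent genes themselves, so 990 are true chimeras.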



Table 16

Vector + Fragment                                    Number of colonies    % of colonies with active Lipolase
pJSO37/PvuII-SpeI + PCR353 + PCR354 + PCR367         > 5000                100%
pJSO37/PvuII-SpeI + PCR353 + PCR355 + PCR367         > 5000                100%
pJSO37/PvuII-SpeI                                    20                    100%

Recombination result of three fragments and a gapped vector. The
last row is a control.

SEQUENCE LISTING



(1) GENERAL INFORMATION:

(i) APPLICANT:
(A) NAME: Novo Nordisk A/S
(B) STREET: Novo Alle
(C) CITY: Bagsvaerd
(E) COUNTRY: Denmark
(F) POSTAL CODE (ZIP): DK-2880
(G) TELEPHONE: +45 4444 8888
(H) TELEFAX: +45 4449 3256

(ii) TITLE OF INVENTION: Method for preparing polypeptide variants

(iii) NUMBER OF SEQUENCES: 15

(iv) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30B (EPO)

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "Primer 2843"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

ACAAACATTA CGTGCACGGG    20

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "Primer 4699"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

CGGTACCCGG GGATCCAC    18

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "Primer 5164"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

AATTACATCA TGCGGCCC    18



(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "Primer 8487"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

CATTTGCTCC GGCTGCAGGG A    21

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 60 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "Primer 4548"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

GTGTTCCGCC GGTCTGTACG GTCAGGAATT CTGCAAAAGC CCTGTTTCCG ACTCGGGGGG    60

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "Primer 5576"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

GGTCTGTACG GTCAGGAATT C    21

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "Primer 5578"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

CGTTTCGGGT GACGGGGAC    19

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "Primer 1596"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

GGAGCAAATG TCATTTAT    18



(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 64 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "Primer 4545"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

GCATTGGCAA CTGTTGCCGG AGCAGACCTG CGTGGAAATG GGTATGATAT CGACGTGTTT    60
TCAT                                                                 64



(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 876 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: circular

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "Vector pJSO26"

(vi) ORIGINAL SOURCE:
(B) STRAIN: Humicola lanuginosa

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..876

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

ATG AGG AGC TCC CTT GTG CTG TTC TTT GTC TCT GCG TGG ACG GCC TTG     48
Met Arg Ser Ser Leu Val Leu Phe Phe Val Ser Ala Trp Thr Ala Leu
1               5                   10                  15

GCC AGT CCT ATT CGT CGA GAG GTC TCG CAG GAT CTG TTT AAC CAG TTC     96
Ala Ser Pro Ile Arg Arg Glu Val Ser Gln Asp Leu Phe Asn Gln Phe
            20                  25                  30

AAT CTG TTT GCA CAG TAT TCT GCA GCC GCA TAC TGC GGA AAA AAC AAT    144
Asn Leu Phe Ala Gln Tyr Ser Ala Ala Ala Tyr Cys Gly Lys Asn Asn
        35                  40                  45

GAT GCC CCA GCT GGT ACA AAC ATT ACG TGC ACG GGA AAT GCC TGC CCC    192
Asp Ala Pro Ala Gly Thr Asn Ile Thr Cys Thr Gly Asn Ala Cys Pro
    50                  55                  60

GAG GTA GAG AAG GCG GAT GCA ACG TTT CTC TAC TCG TTT GAA GAC TCT    240
Glu Val Glu Lys Ala Asp Ala Thr Phe Leu Tyr Ser Phe Glu Asp Ser
65                  70                  75                  80

GGA GTG GGC GAT GTC ACC GGC TTC CTT GCT CTC GAC AAC ACG AAC AAA    288
Gly Val Gly Asp Val Thr Gly Phe Leu Ala Leu Asp Asn Thr Asn Lys
                85                  90                  95

TTG ATC GTC CTC TCT TTC CGT GGC TCT CGT TCC ATA GAG AAC TGG ATC    336
Leu Ile Val Leu Ser Phe Arg Gly Ser Arg Ser Ile Glu Asn Trp Ile
            100                 105                 110

GGG AAT CTT AAC TTC GAC TTG AAA GAA ATA AAT GAC ATT TGC TCC GGC    384
Gly Asn Leu Asn Phe Asp Leu Lys Glu Ile Asn Asp Ile Cys Ser Gly
        115                 120                 125

TGC AGG GGA CAT GAC GGC TTC ACT TCG TCC TGG AGG TCT GTA GCC GAT    432
Cys Arg Gly His Asp Gly Phe Thr Ser Ser Trp Arg Ser Val Ala Asp
    130                 135                 140

ACG TTA AGG CAG AAG GTG GAG GAT GCT GTG AGG GAG CAT CCC GAC TAT    480
Thr Leu Arg Gln Lys Val Glu Asp Ala Val Arg Glu His Pro Asp Tyr
145                 150                 155                 160

CGC GTG GTG TTT ACC GGA CAT AGC TTG GGT GGT GCA TTG GCA ACT GTT    528
Arg Val Val Phe Thr Gly His Ser Leu Gly Gly Ala Leu Ala Thr Val
                165                 170                 175

GCC GGA GCA GAC CTG CGT GGA AAT GGG TAT GAT ATC GAC GTG TTT TCA    576
Ala Gly Ala Asp Leu Arg Gly Asn Gly Tyr Asp Ile Asp Val Phe Ser
            180                 185                 190

TAT GGC GCC CCC CGA GTC GGA AAC AGG GCT TTT GCA GAA TTC CTG ACC    624
Tyr Gly Ala Pro Arg Val Gly Asn Arg Ala Phe Ala Glu Phe Leu Thr
        195                 200                 205

GTA CAG ACC GGC GGA ACA CTC TAC CGC ATT ACC CAC ACC AAT GAT ATT    672
Val Gln Thr Gly Gly Thr Leu Tyr Arg Ile Thr His Thr Asn Asp Ile
    210                 215                 220

GTC CCT AGA CTC CCG CCG CGC GAA TTC GGT TAC AGC CAT TCT AGC CCA    720
Val Pro Arg Leu Pro Pro Arg Glu Phe Gly Tyr Ser His Ser Ser Pro
225                 230                 235                 240

GAG TAC TGG ATC AAA TCT GGA ACC CTT GTC CCC GTC ACC CGA AAC GAT    768
Glu Tyr Trp Ile Lys Ser Gly Thr Leu Val Pro Val Thr Arg Asn Asp
                245                 250                 255

ATC GTG AAG ATA GAA GGC ATC GAT GCC ACC GGC GGC AAT AAC CAG CCT    816
Ile Val Lys Ile Glu Gly Ile Asp Ala Thr Gly Gly Asn Asn Gln Pro
            260                 265                 270

AAC ATT CCG GAT ATC CCT GCG CAC CTA TGG TAC TTC GGG TTA ATT GGG    864
Asn Ile Pro Asp Ile Pro Ala His Leu Trp Tyr Phe Gly Leu Ile Gly
        275                 280                 285

ACA TGT CTT TAG                                                    876
Thr Cys Leu  *
    290



(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 292 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

Met Arg Ser Ser Leu Val Leu Phe Phe Val Ser Ala Trp Thr Ala Leu
1               5                   10                  15

Ala Ser Pro Ile Arg Arg Glu Val Ser Gln Asp Leu Phe Asn Gln Phe
            20                  25                  30

Asn Leu Phe Ala Gln Tyr Ser Ala Ala Ala Tyr Cys Gly Lys Asn Asn
        35                  40                  45

Asp Ala Pro Ala Gly Thr Asn Ile Thr Cys Thr Gly Asn Ala Cys Pro
    50                  55                  60

Glu Val Glu Lys Ala Asp Ala Thr Phe Leu Tyr Ser Phe Glu Asp Ser
65                  70                  75                  80

Gly Val Gly Asp Val Thr Gly Phe Leu Ala Leu Asp Asn Thr Asn Lys
                85                  90                  95

Leu Ile Val Leu Ser Phe Arg Gly Ser Arg Ser Ile Glu Asn Trp Ile
            100                 105                 110

Gly Asn Leu Asn Phe Asp Leu Lys Glu Ile Asn Asp Ile Cys Ser Gly
        115                 120                 125

Cys Arg Gly His Asp Gly Phe Thr Ser Ser Trp Arg Ser Val Ala Asp
    130                 135                 140

Thr Leu Arg Gln Lys Val Glu Asp Ala Val Arg Glu His Pro Asp Tyr
145                 150                 155                 160

Arg Val Val Phe Thr Gly His Ser Leu Gly Gly Ala Leu Ala Thr Val
                165                 170                 175

Ala Gly Ala Asp Leu Arg Gly Asn Gly Tyr Asp Ile Asp Val Phe Ser
            180                 185                 190

Tyr Gly Ala Pro Arg Val Gly Asn Arg Ala Phe Ala Glu Phe Leu Thr
        195                 200                 205

Val Gln Thr Gly Gly Thr Leu Tyr Arg Ile Thr His Thr Asn Asp Ile
    210                 215                 220

Val Pro Arg Leu Pro Pro Arg Glu Phe Gly Tyr Ser His Ser Ser Pro
225                 230                 235                 240

Glu Tyr Trp Ile Lys Ser Gly Thr Leu Val Pro Val Thr Arg Asn Asp
                245                 250                 255

Ile Val Lys Ile Glu Gly Ile Asp Ala Thr Gly Gly Asn Asn Gln Pro
            260                 265                 270

Asn Ile Pro Asp Ile Pro Ala His Leu Trp Tyr Phe Gly Leu Ile Gly
        275                 280                 285

Thr Cys Leu  *
    290

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 876 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: circular

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "Vector pJSO37"

(vi) ORIGINAL SOURCE:
(B) STRAIN: Humicola lanuginosa

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..876

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

ATG AGG AGC TCC CTT GTG CTG TTC TTT GTC TCT GCG TGG ACG GCC TTG     48
Met Arg Ser Ser Leu Val Leu Phe Phe Val Ser Ala Trp Thr Ala Leu
1               5                   10                  15

GCC AGT CCT ATA CGT AGA GAG GTC TCG CAG GAT CTG TTT AAC CAG TTC     96
Ala Ser Pro Ile Arg Arg Glu Val Ser Gln Asp Leu Phe Asn Gln Phe
            20                  25                  30

AAT CTC TTT GCA CAG TAT TCA GCT GCC GCA TAC TGC GGA AAA AAC AAT    144
Asn Leu Phe Ala Gln Tyr Ser Ala Ala Ala Tyr Cys Gly Lys Asn Asn
        35                  40                  45

GAT GCC CCA GCA GGT ACA AAC ATT ACG TGC ACG GGA AAT GCA TGC CCC    192
Asp Ala Pro Ala Gly Thr Asn Ile Thr Cys Thr Gly Asn Ala Cys Pro
    50                  55                  60

GAG GTA GAG AAG GCG GAT GCA ACG TTT CTC TAC TCG TTT GAA GAC TCT    240
Glu Val Glu Lys Ala Asp Ala Thr Phe Leu Tyr Ser Phe Glu Asp Ser
65                  70                  75                  80

GGA GTG GGC GAT GTC ACC GGC TTC CTT GCT CTC GAC AAC ACG AAC AAG    288
Gly Val Gly Asp Val Thr Gly Phe Leu Ala Leu Asp Asn Thr Asn Lys
                85                  90                  95

CTT ATC GTC CTC TCT TTC CGT GGC TCA AGA TCT ATA GAG AAC TGG ATC    336
Leu Ile Val Leu Ser Phe Arg Gly Ser Arg Ser Ile Glu Asn Trp Ile
            100                 105                 110

GGG AAT CTT AAC TTC GAC TTG AAA GAA ATA AAT GAC ATT TGC TCC GGC    384
Gly Asn Leu Asn Phe Asp Leu Lys Glu Ile Asn Asp Ile Cys Ser Gly
        115                 120                 125

TGC AGG GGA CAT GAC GGC TTC ACT TCG TCC TGG AGG TCT GTA GCC GAT    432
Cys Arg Gly His Asp Gly Phe Thr Ser Ser Trp Arg Ser Val Ala Asp
    130                 135                 140

ACG TTA AGG CAG AAG GTG GAG GAT GCT GTT CGC GAG CAT CCC GAC TAT    480
Thr Leu Arg Gln Lys Val Glu Asp Ala Val Arg Glu His Pro Asp Tyr
145                 150                 155                 160

CGC GTG GTG TTT ACC GGC CAT AGC CTT GGT GGT GCG CTA GCA ACT GTT    528
Arg Val Val Phe Thr Gly His Ser Leu Gly Gly Ala Leu Ala Thr Val
                165                 170                 175

GCC GGA GCA GAC CTG CGT GGA AAT GGG TAT GAT ATC GAC GTG TTT TCA    576
Ala Gly Ala Asp Leu Arg Gly Asn Gly Tyr Asp Ile Asp Val Phe Ser
            180                 185                 190

TAT GGC GCC CCC CGA GTC GGT AAC CGT GCT TTT GCA GAA TTC CTG ACC    624
Tyr Gly Ala Pro Arg Val Gly Asn Arg Ala Phe Ala Glu Phe Leu Thr
        195                 200                 205

GTA CAG ACC GGC GGT ACC CTC TAC CGC ATT ACC CAC ACC AAT GAT ATT    672
Val Gln Thr Gly Gly Thr Leu Tyr Arg Ile Thr His Thr Asn Asp Ile
    210                 215                 220

GTC CCT AGA CTC CCG CCT CGA GAA TTC GGT TAC AGC CAT TCT AGC CCA    720
Val Pro Arg Leu Pro Pro Arg Glu Phe Gly Tyr Ser His Ser Ser Pro
225                 230                 235                 240

GAG TAC TGG ATC AAA TCT GGA ACA CTA GTC CCC GTC ACC CGA AAC GAT    768
Glu Tyr Trp Ile Lys Ser Gly Thr Leu Val Pro Val Thr Arg Asn Asp
                245                 250                 255

ATC GTG AAG ATA GAA GGC ATC GAT GCC ACC GGC GGC AAT AAC CAG CCT    816
Ile Val Lys Ile Glu Gly Ile Asp Ala Thr Gly Gly Asn Asn Gln Pro
            260                 265                 270

AAC ATT CCG GAT ATC CCT GCG CAC CTA TGG TAC TTC GGG TTA ATT GGG    864
Asn Ile Pro Asp Ile Pro Ala His Leu Trp Tyr Phe Gly Leu Ile Gly
        275                 280                 285

ACA TGT CTT TAG                                                    876
Thr Cys Leu  *
    290

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 292 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

Met Arg Ser Ser Leu Val Leu Phe Phe Val Ser Ala Trp Thr Ala Leu
1               5                   10                  15

Ala Ser Pro Ile Arg Arg Glu Val Ser Gln Asp Leu Phe Asn Gln Phe
            20                  25                  30

Asn Leu Phe Ala Gln Tyr Ser Ala Ala Ala Tyr Cys Gly Lys Asn Asn
        35                  40                  45

Asp Ala Pro Ala Gly Thr Asn Ile Thr Cys Thr Gly Asn Ala Cys Pro
    50                  55                  60

Glu Val Glu Lys Ala Asp Ala Thr Phe Leu Tyr Ser Phe Glu Asp Ser
65                  70                  75                  80

Gly Val Gly Asp Val Thr Gly Phe Leu Ala Leu Asp Asn Thr Asn Lys
                85                  90                  95

Leu Ile Val Leu Ser Phe Arg Gly Ser Arg Ser Ile Glu Asn Trp Ile
            100                 105                 110

Gly Asn Leu Asn Phe Asp Leu Lys Glu Ile Asn Asp Ile Cys Ser Gly
        115                 120                 125

Cys Arg Gly His Asp Gly Phe Thr Ser Ser Trp Arg Ser Val Ala Asp
    130                 135                 140

Thr Leu Arg Gln Lys Val Glu Asp Ala Val Arg Glu His Pro Asp Tyr
145                 150                 155                 160

Arg Val Val Phe Thr Gly His Ser Leu Gly Gly Ala Leu Ala Thr Val
                165                 170                 175

Ala Gly Ala Asp Leu Arg Gly Asn Gly Tyr Asp Ile Asp Val Phe Ser
            180                 185                 190

Tyr Gly Ala Pro Arg Val Gly Asn Arg Ala Phe Ala Glu Phe Leu Thr
        195                 200                 205

Val Gln Thr Gly Gly Thr Leu Tyr Arg Ile Thr His Thr Asn Asp Ile
    210                 215                 220

Val Pro Arg Leu Pro Pro Arg Glu Phe Gly Tyr Ser His Ser Ser Pro
225                 230                 235                 240

Glu Tyr Trp Ile Lys Ser Gly Thr Leu Val Pro Val Thr Arg Asn Asp
                245                 250                 255

Ile Val Lys Ile Glu Gly Ile Asp Ala Thr Gly Gly Asn Asn Gln Pro
            260                 265                 270

Asn Ile Pro Asp Ile Pro Ala His Leu Trp Tyr Phe Gly Leu Ile Gly
        275                 280                 285

Thr Cys Leu  *
    290



(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 864 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:
(B) STRAIN: Pseudomonas sp.

(ix) FEATURE:
(A) NAME/KEY: mat_peptide
(B) LOCATION: 1..864

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..864

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

TTC GGC TCC TCG AAC TAC ACC AAG ACC CAG TAC CCG ATC GTC CTG ACC     48
Phe Gly Ser Ser Asn Tyr Thr Lys Thr Gln Tyr Pro Ile Val Leu Thr
1               5                   10                  15

CAC GGC ATG CTC GGT TTC GAC AGC CTG CTT GGA GTC GAC TAC TGG TAC     96
His Gly Met Leu Gly Phe Asp Ser Leu Leu Gly Val Asp Tyr Trp Tyr
            20                  25                  30

GGC ATT CCC TCA GCC CTG CGT     GAC GGC GCC ACC GTC TAC GTC ACC    144
Gly Ile Pro Ser Ala Leu Arg Lys Asp Gly Ala Thr Val Tyr Val Thr
        35                  40                  45

    GTC AGC CAG CTC GAC ACC TCC GAA GCC CGA GGT GAG CAA CTG CTG    192
Glu Val Ser Gln Leu Asp Thr Ser Glu Ala Arg Gly Glu Gln Leu Leu
    50                  55                  60

ACC CAA GTC GAG GAA ATC GTG GCC ATC AGC GGC AAG CCC AAG GTC AAC    240
Thr Gln Val Glu Glu Ile Val Ala Ile Ser Gly Lys Pro Lys Val Asn
65                  70                  75                  80

CTG TTC GGC CAC AGC CAT GGC GGG CCT ACC ATC CGC TAC GTT GCC GCC    288
Leu Phe Gly His Ser His Gly Gly Pro Thr Ile Arg Tyr Val Ala Ala
                85                  90                  95

GTG CGC CCG GAT CTG GTC GCC TCG GTC ACC AGC ATT GGC GCG CCG CAC    336
Val Arg Pro Asp Leu Val Ala Ser Val Thr Ser Ile Gly Ala Pro His
            100                 105                 110

AAG GGT TCG GCC ACC GCC GAC TTC ATC CGC CAG GTG CCG GAA GGA TCG    384
Lys Gly Ser Ala Thr Ala Asp Phe Ile Arg Gln Val Pro Glu Gly Ser
        115                 120                 125

        GAA GCG ATT CTG GCC GGG ATC GTC AAT GGT CTG GGT GCG CTG    432
Ala Ser Glu Ala Ile Leu Ala Gly Ile Val Asn Gly Leu Gly Ala Leu
    130                 135                 140

ATC AAC TTC CTT TCC GGC AGC AGT TCG GAC ACC CCA CAG AAC TCG CTG    480
Ile Asn Phe Leu Ser Gly Ser Ser Ser Asp Thr Pro Gln Asn Ser Leu
145                 150                 155                 160

                    CTG AAC TCC GAA GGC GCC GCA CGG TTT AAC GCC    528
Gly Thr Leu Glu Ser Leu Asn Ser Glu Gly Ala Ala Arg Phe Asn Ala
                165                 170                 175

    TTC CCC CAG GGG GTA CCA ACC AGC GCC TGC GGC GAG GGC GAT TAC    576
Arg Phe Pro Gln Gly Val Pro Thr Ser Ala Cys Gly Glu Gly Asp Tyr
            180                 185                 190

                        TAT TAC TCC TGG AGG GGC ACC AGC CCG CTG    624
Val Val Asn Gly Val Arg Tyr Tyr Ser Trp Arg Gly Thr Ser Pro Leu
        195                 200                 205

ACC AAC GTA CTC GAC CCC TCC GAC CTG CTG CTC GGC GCC ACC TCC CTG    672
Thr Asn Val Leu Asp Pro Ser Asp Leu Leu Leu Gly Ala Thr Ser Leu
    210                 215                 220

                GAG GCC AAC GAT GGT CTG GTC GGA CGC TGC AGC TCC    720
Thr Phe Gly Phe Glu Ala Asn Asp Gly Leu Val Gly Arg Cys Ser Ser
225                 230                 235                 240

        GGT ATG GTG ATC CGC GAC AAC TAC CGG ATG AAC CAC CTG GAC    768
Arg Leu Gly Met Val Ile Arg Asp Asn Tyr Arg Met Asn His Leu Asp
                245                 250                 255

GAG GTG AAC CAG ACC TTC GGG CTG ACC AGC ATC TTC GAG ACC AGC CCG    816
Glu Val Asn Gln Thr Phe Gly Leu Thr Ser Ile Phe Glu Thr Ser Pro
            260                 265                 270

                CGC CAG CAA GCC AAT CGC CTG AAG AAC GCC GGG CTC    864
Val Ser Val Tyr Arg Gln Gln Ala Asn Arg Leu Lys Asn Ala Gly Leu
        275                 280                 285



(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 288 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

Phe Gly Ser Ser Asn Tyr Thr Lys Thr Gln Tyr Pro Ile Val Leu Thr
1               5                   10                  15

His Gly Met Leu Gly Phe Asp Ser Leu Leu Gly Val Asp Tyr Trp Tyr
            20                  25                  30

Gly Ile Pro Ser Ala Leu Arg Lys Asp Gly Ala Thr Val Tyr Val Thr
        35                  40                  45

Glu Val Ser Gln Leu Asp Thr Ser Glu Ala Arg Gly Glu Gln Leu Leu
    50                  55                  60

Thr Gln Val Glu Glu Ile Val Ala Ile Ser Gly Lys Pro Lys Val Asn
65                  70                  75                  80

Leu Phe Gly His Ser His Gly Gly Pro Thr Ile Arg Tyr Val Ala Ala
                85                  90                  95

Val Arg Pro Asp Leu Val Ala Ser Val Thr Ser Ile Gly Ala Pro His
            100                 105                 110

Lys Gly Ser Ala Thr Ala Asp Phe Ile Arg Gln Val Pro Glu Gly Ser
        115                 120                 125

Ala Ser Glu Ala Ile Leu Ala Gly Ile Val Asn Gly Leu Gly Ala Leu
    130                 135                 140

Ile Asn Phe Leu Ser Gly Ser Ser Ser Asp Thr Pro Gln Asn Ser Leu
145                 150                 155                 160

Gly Thr Leu Glu Ser Leu Asn Ser Glu Gly Ala Ala Arg Phe Asn Ala
                165                 170                 175

Arg Phe Pro Gln Gly Val Pro Thr Ser Ala Cys Gly Glu Gly Asp Tyr
            180                 185                 190

Val Val Asn Gly Val Arg Tyr Tyr Ser Trp Arg Gly Thr Ser Pro Leu
        195                 200                 205

Thr Asn Val Leu Asp Pro Ser Asp Leu Leu Leu Gly Ala Thr Ser Leu
    210                 215                 220

Thr Phe Gly Phe Glu Ala Asn Asp Gly Leu Val Gly Arg Cys Ser Ser
225                 230                 235                 240

Arg Leu Gly Met Val Ile Arg Asp Asn Tyr Arg Met Asn His Leu Asp
                245                 250                 255

Glu Val Asn Gln Thr Phe Gly Leu Thr Ser Ile Phe Glu Thr Ser Pro
            260                 265                 270

Val Ser Val Tyr Arg Gln Gln Ala Asn Arg Leu Lys Asn Ala Gly Leu
        275                 280                 285

PATENT CLAIMS

1. A method for preparing polypeptide variants by shuffling 
different nucleotide sequences of homologous DNA sequences by in 
vivo recombination comprising the steps of 

a) forming at least one circular plasmid comprising a DNA sequence 
encoding a polypeptide, 

b) opening said circular plasmid(s) within the DNA sequence(s) 
encoding the polypeptide(s), 

c) preparing at least one DNA fragment comprising a DNA sequence 
homologous to at least a part of the polypeptide coding region on at 
least one of the circular plasmid(s), 

d) introducing at least one of said opened plasmid(s), together with 
at least one of said homologous DNA fragment(s) covering full-length 
DNA sequences encoding said polypeptide(s) or parts thereof, into a 
recombination host cell, 

e) cultivating said recombination host cell, and 

f) screening for positive polypeptide variants. 

2. The method according to claim 1, wherein more than one cycle of 
steps a) to f) is performed. 

3. The method according to claims 1 and 2, wherein two or more 
opened plasmids are shuffled with one or more homologous DNA 
fragments in the same shuffling cycle. 

4. The method according to any of claims 1 to 3, wherein the opened 
plasmid(s) is (are) gapped. 

5. The method according to any of claims 1 to 4, wherein the ratio 
between the opened plasmid(s) and homologous DNA fragment(s) is in 
the range from 20:1 to 1:50, preferably from 2:1 to 1:10 (mol 
vector:mol fragments), with the specific concentrations being from 1 
pM to 10 M of the DNA. 

6. The method according to any of claims 1 to 5, wherein 2 or more, 
preferably from 2 to 6, especially 2 to 4 of the DNA fragments have 
partially overlapping regions. 

7. The method according to claim 6, wherein the overlapping regions 
of the DNA fragments lie in the range from 5 to 5000 bp, preferably 
from 10 bp to 500 bp, especially 10 bp to 100 bp. 

8. The method according to any of claims 1 to 7, wherein at least 
one cycle of steps a) to f) is backcrossing with the initially used 
DNA fragments. 

9. The method according to any of claims 1 to 8, wherein the 
plasmid(s) is(are) opened in the region around the middle of the DNA 
sequence(s) encoding the polypeptide(s). 



10. The method according to any of claims 1 to 9, wherein the 
plasmid(s) is(are) opened close to a mutation in the DNA sequence(s) 
encoding the polypeptide(s). 

11. The method according to any of claims 1 to 10, wherein the DNA 
fragment(s) prepared in step c) is(are) prepared under conditions 
suitable for high, medium or low mutagenesis. 

12. The method according to any of claims 1 to 11, wherein the 
polypeptides producible from the input DNA sequences are enzymes or 
proteins with biological activity. 

13. The method according to claim 12, wherein the polypeptides are 
enzymes selected from the group including proteases, lipases, 
cutinases, cellulases, amylases, peroxidases, oxidases and phytases. 

14. The method according to claim 12, wherein the polypeptides are 
proteins with biological activity selected from the group including 
insulin, ACTH, glucagon, somatostatin, somatotropin, thymosin, 
parathyroid hormone, pigmentary hormones, somatomedin, 
erythropoietin, luteinizing hormone, chorionic gonadotropin, 
hypothalamic releasing factors, antidiuretic hormones, thyroid 
stimulating hormone, relaxin, interferon, thrombopoietin (TPO) and 
prolactin. 

15. The method according to any of claims 1 to 13, wherein at least 
one of the initially used input DNA sequences is a wild-type DNA 
sequence, such as a DNA sequence coding for wild-type enzymes, in 
particular lipases, derived from filamentous fungi, such as Humicola 
sp., in particular Humicola lanuginosa, especially Humicola 
lanuginosa DSM 4109. 



16. The method according to claim 15, wherein at least one of the 
input DNA sequences is selected from the group of vectors (a) to (f) 
and/or DNA fragments (g) to (aa) coding for Humicola lanuginosa 
lipase variants. 



17. The method according to any of claims 1 to 13, wherein at least 
one of the initially used input DNA sequences is a wild-type DNA 
sequence, such as a DNA sequence coding for wild-type enzymes, in 
particular lipases, derived from filamentous fungi of the genera 
Absidia, Rhizopus, Emericella, Aspergillus, Penicillium, 
Eupenicillium, Paecilomyces, Talaromyces, Thermoascus and 
Sclerocleista. 

18. The method according to any of claims 1 to 13, wherein at least 
one of the initially used input DNA sequences is a wild-type DNA 
sequence, such as a DNA sequence coding for wild-type enzymes, in 
particular lipases, derived from bacteria, such as Pseudomonas sp., 
in particular Ps. fragi, Ps. stutzeri, Ps. cepacia, Ps. fluorescens, 
Ps. plantarii, Ps. gladioli, Ps. alcaligenes, Ps. pseudoalcaligenes, 
Ps. mendocina, Ps. aeruginosa, Ps. glumae, Ps. syringae, Ps. 
wisconsinensis, or a strain of Bacillus sp., in particular B. 
subtilis, B. stearothermophilus or B. pumilus, or a strain of 
Streptomyces sp., in particular S. scabies, or a strain of 
Chromobacterium sp., in particular C. viscosum. 

19. The method according to any of claims 1 to 13, wherein at least 
one of the initially used input DNA sequences is a variant DNA 
sequence, such as a DNA sequence coding for a variant enzyme, in 
particular lipase variants, derived from yeasts, such as Candida 
sp., in particular Candida rugosa, or Geotrichum sp., in particular 
Geotrichum candidum. 

20. The method according to any of claims 1 to 19, wherein the 
homologous input DNA sequences are at least 60%, preferably at least 
70%, better more than 80%, especially more than 90%, and even up to 
100% homologous. 



21. The method according to any of claims 1 to 20, wherein the 
recombination host cell is a eukaryotic cell, such as a fungal cell 
or a plant cell. 

22. The method according to claim 21, wherein said fungal cell is a 
yeast cell from the group of cells of Saccharomyces sp., in 
particular strains of Saccharomyces cerevisiae or Saccharomyces 
kluyveri, or Schizosaccharomyces sp., in particular 
Schizosaccharomyces pombe, or Kluyveromyces sp., such as K. lactis, 
or Hansenula sp., in particular H. polymorpha, or Pichia sp., in 
particular P. pastoris, or a filamentous fungus from the group of 
Aspergillus sp., in particular A. niger, A. nidulans or A. oryzae, 
or Neurospora sp., or Fusarium sp., in particular F. oxysporum, or 
Trichoderma sp. 

23. The method according to any of claims 1 to 22, wherein the 
plasmid DNA sequence(s) coding for the polypeptide(s) is(are) 
operably linked to a replication sequence. 

24. The method according to claim 23, wherein the plasmid DNA 
sequence(s) encoding the polypeptide(s) is(are) operably linked to a 
functional promoter sequence. 

25. The method according to claim 24, wherein the plasmid is an 
expression plasmid. 

26. The method according to claim 25, wherein the expression 
plasmid is pJS026 or pJS037. 
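
The vector:fragment molar ratios recited in claim 5 can be turned into DNA masses with simple arithmetic. The following is a minimal sketch under stated assumptions: the average mass of 650 g/mol per base pair of double-stranded DNA is a common approximation, and the function names, the 6000 bp vector and the 900 bp fragment are hypothetical illustration values, not taken from the specification.

```python
AVG_BP_MASS = 650.0  # g/mol per base pair of dsDNA (approximation, an assumption)


def picomoles(mass_ng: float, length_bp: int) -> float:
    """Amount in pmol of a double-stranded DNA of given mass (ng) and length (bp)."""
    return mass_ng * 1000.0 / (length_bp * AVG_BP_MASS)


def fragment_mass_ng(vector_ng: float, vector_bp: int, fragment_bp: int,
                     vector_parts: float, fragment_parts: float) -> float:
    """Fragment mass (ng) giving a vector:fragment molar ratio of
    vector_parts:fragment_parts for the stated amount of opened vector."""
    vector_pmol = picomoles(vector_ng, vector_bp)
    fragment_pmol = vector_pmol * fragment_parts / vector_parts
    return fragment_pmol * fragment_bp * AVG_BP_MASS / 1000.0


def ratio_in_claimed_range(vector_parts: float, fragment_parts: float,
                           preferred: bool = False) -> bool:
    """Check a vector:fragment molar ratio against the range of claim 5:
    20:1 to 1:50, or 2:1 to 1:10 when preferred=True."""
    ratio = vector_parts / fragment_parts
    low, high = (1 / 10, 2 / 1) if preferred else (1 / 50, 20 / 1)
    return low <= ratio <= high


# Hypothetical example: 100 ng of a 6000 bp opened vector mixed 1:3 (mol:mol)
# with a 900 bp fragment.
print(round(fragment_mass_ng(100, 6000, 900, 1, 3), 1))  # 45.0
print(ratio_in_claimed_range(1, 3, preferred=True))      # True
```

Because both molecules use the same per-base-pair mass, the approximation cancels in the mass calculation; it only matters when absolute picomole amounts are reported.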

     ATGAGGAGCTCCCTTGTGCTGTTCTTTGTCTCTGCGTGGACGGCCTTGGCCAGTCCTATT 
5447 ---------+---------+---------+---------+---------+---------+ 5506 
1     M  R  S  S  L  V  L  F  F  V  S  A  W  T  A  L  A  S  P  I 

     CGTCGAGAGGTCTCGCAGGATCTGTTTAACCAGTTCAATCTCTTTGCACAGTATTCTGCA 
5507 ---------+---------+---------+---------+---------+---------+ 5566 
21    R  R  E  V  S  Q  D  L  F  N  Q  F  N  L  F  A  Q  Y  S  A 

     GCCGCATACTGCGGAAAAAACAATGATGCCCCAGCTGGTACAAACATTACGTGCACGGGA 
5567 ---------+---------+---------+---------+---------+---------+ 5626 
41    A  A  Y  C  G  K  N  N  D  A  P  A  G  T  N  I  T  C  T  G 

     AATGCCTGCCCCGAGGTAGAGAAGGCGGATGCAACGTTTCTCTACTCCTTTGAAGACTCT 
5627 ---------+---------+---------+---------+---------+---------+ 5686 
61    N  A  C  P  E  V  E  K  A  D  A  T  F  L  Y  S  F  E  D  S 

     GGAGTGGGCGATGTCACCGGCTTCCTTGCTCTCGACAACACGAACAAATTGATCGTCCTC 
5687 ---------+---------+---------+---------+---------+---------+ 5746 
81    G  V  G  D  V  T  G  F  L  A  L  D  N  T  N  K  L  I  V  L 

     TCTTTCCGTGGCTCTCGTTCCATAGAGAACTGGATCGGGAATCTTAACTTCGACTTGAAA 
5747 ---------+---------+---------+---------+---------+---------+ 5806 
101   S  F  R  G  S  R  S  I  E  N  W  I  G  N  L  N  F  D  L  K 

     GAAATAAATGACATTTGCTCCGGCTGCAGGGGACATGACGGCTTCACTTCGTCCTGGAGG 
5807 ---------+---------+---------+---------+---------+---------+ 5866 
121   E  I  N  D  I  C  S  G  C  R  G  H  D  G  F  T  S  S  W  R 

     TCTGTAGCCGATACGTTAAGGCAGAAGGTGGAGGATGCTGTGAGGGAGCATCCCGACTAT 
5867 ---------+---------+---------+---------+---------+---------+ 5926 
141   S  V  A  D  T  L  R  Q  K  V  E  D  A  V  R  E  H  P  D  Y 

     CGCGTGGTGTTTACCGGACATAGCTTGGGTGGTGCATTGGCAACTGTTGCCGGAGCAGAC 
5927 ---------+---------+---------+---------+---------+---------+ 5986 
161   R  V  V  F  T  G  H  S  L  G  G  A  L  A  T  V  A  G  A  D 

     CTGCGTGGAAATGGGTATGATATCGACGTGTTTTCATATGGCGCCCCCCGAGTCGGAAAC 
5987 ---------+---------+---------+---------+---------+---------+ 6046 
181   L  R  G  N  G  Y  D  I  D  V  F  S  Y  G  A  P  R  V  G  N 

     AGGGCTTTTGCAGAATTCCTGACCGTACAGACCGGCGGAACACTCTACCGCATTACCCAC 
6047 ---------+---------+---------+---------+---------+---------+ 6106 
201   R  A  F  A  E  F  L  T  V  Q  T  G  G  T  L  Y  R  I  T  H 

     ACCAATGATATTGTCCCTAGACTCCCGCCGCGCGAATTCGGTTACAGCCATTCTAGCCCA 
6107 ---------+---------+---------+---------+---------+---------+ 6166 
221   T  N  D  I  V  P  R  L  P  P  R  E  F  G  Y  S  H  S  S  P 

     GAGTACTGGATCAAATCTGGAACCCTTGTCCCCGTCACCCGAAACGATATCGTGAAGATA 
6167 ---------+---------+---------+---------+---------+---------+ 6226 
241   E  Y  W  I  K  S  G  T  L  V  P  V  T  R  N  D  I  V  K  I 

     GAAGGCATCGATGCCACCGGCGGCAATAACCAGCCTAACATTCCGGATATCCCTGCGCAC 
6227 ---------+---------+---------+---------+---------+---------+ 6286 
261   E  G  I  D  A  T  G  G  N  N  Q  P  N  I  P  D  I  P  A  H 

     CTATGGTACTTCGGGTTAATTGGGACATGTCTTTAG 
6287 ---------+---------+---------+------ 6322 
281   L  W  Y  F  G  L  I  G  T  C  L  * 



Fig. 1 
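
The pairing of nucleotide lines and one-letter amino acid lines in Fig. 1 follows the standard genetic code. As a minimal sketch (the helper name `translate` is hypothetical and the code is an illustration, not part of the disclosed method), a line of the figure can be checked against its translation as follows:

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W"   # T--
         "LLLLPPPPHHQQRRRR"   # C--
         "IIIMTTTTNNKKSSRR"   # A--
         "VVVVAAAADDEEGGGG")  # G--
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; stop codons are written as '*'.

    Trailing bases that do not form a complete codon are ignored.
    """
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]]
                   for i in range(0, len(dna) - len(dna) % 3, 3))


# First nucleotide line of Fig. 1 (positions 5447 to 5506).
line_1 = "ATGAGGAGCTCCCTTGTGCTGTTCTTTGTCTCTGCGTGGACGGCCTTGGCCAGTCCTATT"
print(translate(line_1))  # MRSSLVLFFVSAWTALASPI
```

A translation that disagrees with the amino acid line printed under a nucleotide line points to a misread base in that line.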
     ATGAGGAGCTCCCTTGTGCTGTTCTTTGTCTCTGCGTGGACGGCCTTGGCCAGTCCTATA 

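5447 ---------+---------+---------+---------+---------+---------+ 5506 
1     M  R  S  S  L  V  L  F  F  V  S  A  W  T  A  L  A  S  P  I 

     SnaBI 
     CGTAGAGAGGTCTCGCAGGATCTGTTTAACCAGTTCAATCTCTTTGCACAGTATTCAGCT 
5507 ---------+---------+---------+---------+---------+---------+ 5566 
21    R  R  E  V  S  Q  D  L  F  N  Q  F  N  L  F  A  Q  Y  S  A 

     GCCGCATACTGCGGAAAAAACAATGATGCCCCAGCAGGTACAAACATTACGTGCACGGGA 
5567 ---------+---------+---------+---------+---------+---------+ 5626 
41    A  A  Y  C  G  K  N  N  D  A  P  A  G  T  N  I  T  C  T  G 

        SphI 
     AATGCATGCCCCGAGGTAGAGAAGGCGGATGCAACGTTTCTCTACTCGTTTGAAGACTCT 
5627 ---------+---------+---------+---------+---------+---------+ 5686 
61    N  A  C  P  E  V  E  K  A  D  A  T  F  L  Y  S  F  E  D  S 

                                                  HindIII 
     GGAGTGGGCGATGTCACCGGCTTCCTTGCTCTCGACAACACGAACAAGCTTATCGTCCTC 
5687 ---------+---------+---------+---------+---------+---------+ 5746 
81    G  V  G  D  V  T  G  F  L  A  L  D  N  T  N  K  L  I  V  L 

                    BglII 
     TCTTTCCGTGGCTCAAGATCTATAGAGAACTGGATCGGGAATCTTAACTTCGACTTGAAA 
5747 ---------+---------+---------+---------+---------+---------+ 5806 
101   S  F  R  G  S  R  S  I  E  N  W  I  G  N  L  N  F  D  L  K 

     GAAATAAATGACATTTGCTCCGGCTGCAGGGGACATGACGGCTTCACTTCGTCCTGGAGG 
5807 ---------+---------+---------+---------+---------+---------+ 5866 
121   E  I  N  D  I  C  S  G  C  R  G  H  D  G  F  T  S  S  W  R 

                                              NruI 
     TCTGTAGCCGATACGTTAAGGCAGAAGGTGGAGGATGCTGTTCGCGAGCATCCCGACTAT 
5867 ---------+---------+---------+---------+---------+---------+ 5926 
141   S  V  A  D  T  L  R  Q  K  V  E  D  A  V  R  E  H  P  D  Y 

                      BstXI             NheI 
     CGCGTGGTGTTTACCGGCCATAGCCTTGGTGGTGCGCTAGCAACTGTTGCCGGAGCAGAC 
5927 ---------+---------+---------+---------+---------+---------+ 5986 
161   R  V  V  F  T  G  H  S  L  G  G  A  L  A  T  V  A  G  A  D 

                                                           BstEII 
     CTGCGTGGAAATGGGTATGATATCGACGTGTTTTCATATGGCGCCCCCCGAGTCGGTAAC 
5987 ---------+---------+---------+---------+---------+---------+ 6046 
181   L  R  G  N  G  Y  D  I  D  V  F  S  Y  G  A  P  R  V  G  N 

                                         KpnI 
     CGTGCTTTTGCAGAATTCCTGACCGTACAGACCGGCGGTACCCTCTACCGCATTACCCAC 
6047 ---------+---------+---------+---------+---------+---------+ 6106 
201   R  A  F  A  E  F  L  T  V  Q  T  G  G  T  L  Y  R  I  T  H 

                                 XhoI 
     ACCAATGATATTGTCCCTAGACTCCCGCCTCGAGAATTCGGTTACAGCCATTCTAGCCCA 
6107 ---------+---------+---------+---------+---------+---------+ 6166 
221   T  N  D  I  V  P  R  L  P  P  R  E  F  G  Y  S  H  S  S  P 

                            SpeI 
     GAGTACTGGATCAAATCTGGAACACTAGTCCCCGTCACCCGAAACGATATCGTGAAGATA 
6167 ---------+---------+---------+---------+---------+---------+ 6226 
241   E  Y  W  I  K  S  G  T  L  V  P  V  T  R  N  D  I  V  K  I 

     GAAGGCATCGATGCCACCGGCGGCAATAACCAGCCTAACATTCCGGATATCCCTGCGCAC 
6227 ---------+---------+---------+---------+---------+---------+ 6286 
261   E  G  I  D  A  T  G  G  N  N  Q  P  N  I  P  D  I  P  A  H 

     CTATGGTACTTCGGGTTAATTGGGACATGTCTTTAG 
6287 ---------+---------+---------+------ 6322 
281   L  W  Y  F  G  L  I  G  T  C  L  * 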



Fig. 2 
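
The restriction sites labelled in Fig. 2 can be located by scanning the nucleotide lines for the recognition sequences of the named enzymes. The following is a minimal sketch, assuming the commonly published recognition sequences for these enzymes (N stands for any base); the helper name `find_sites` is hypothetical, and the two example lines are the nucleotide lines of Fig. 2 labelled HindIII and BstXI/NheI.

```python
import re

# Recognition sequences (5' to 3') of the enzymes named in Fig. 2.
# These are the commonly published sequences; N matches any base.
SITES = {
    "SnaBI": "TACGTA", "SphI": "GCATGC", "HindIII": "AAGCTT",
    "BglII": "AGATCT", "NruI": "TCGCGA", "BstXI": "CCANNNNNNTGG",
    "NheI": "GCTAGC", "BstEII": "GGTNACC", "KpnI": "GGTACC",
    "XhoI": "CTCGAG", "SpeI": "ACTAGT",
}


def find_sites(dna: str) -> dict:
    """Return {enzyme: [0-based start offsets]} for every enzyme with a hit.

    A lookahead is used so that overlapping occurrences are also reported.
    """
    dna = dna.upper()
    hits = {}
    for enzyme, recognition in SITES.items():
        pattern = recognition.replace("N", "[ACGT]")
        starts = [m.start() for m in re.finditer(f"(?=({pattern}))", dna)]
        if starts:
            hits[enzyme] = starts
    return hits


# Nucleotide lines of Fig. 2 at positions 5687-5746 and 5927-5986.
hindiii_line = "GGAGTGGGCGATGTCACCGGCTTCCTTGCTCTCGACAACACGAACAAGCTTATCGTCCTC"
bstxi_nhei_line = "CGCGTGGTGTTTACCGGCCATAGCCTTGGTGGTGCGCTAGCAACTGTTGCCGGAGCAGAC"

print(find_sites(hindiii_line).get("HindIII"))  # [45]
print(find_sites(bstxi_nhei_line))
```

Sites that span two printed lines, such as a site beginning at the end of one 60-base line and ending on the next, are only found when the lines are concatenated before scanning.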